Sunday, December 28, 2008

More On My Scala Query DSL

I know I promised to explain some things that I still haven't explained, but unfortunately I won't be doing that just yet. I've been too busy coding. I made some cool additions to what I do have though. Now, I can query over more things that just a dictionary of words. In this post I'll show some examples using music scales. Additionally, I added orderBy, and limit stuff, which was really fun.

I've added google code project for this as well, because I kept adding stuff to my cpusimulator project, and it clearly didn't belong. Here's the link: dslplayground.

Here's some examples:


1 val c_maj_and_a_min = contains(Chord("C", Maj), Chord("A", Min))
2
3 var scales_desc =
Scale find(c_maj_and_a_min) limit 4 orderBy (rootChord, Desc)
4 var scales_asc =
Scale find(c_maj_and_a_min) orderBy (rootChord, Asc) limit 3
5
6 scales_desc foreach println
7 println("-------------------------")
8 scales_asc foreach println

Here I'm searching for all Scales (just major and minor scales) that contain Cmaj, and Amin.

On line 3 I'm limiting my search to 4 scales, and ordering the results by the rootChord of the scale, descending.

On line 4 I'm limiting my search to 3 scales, and ordering the results by the rootChord of the scale, ascending. But, notice I've done this in the opposite order as line 3. (The behavior changes though, if I limit the unordered result set before I order it, or after...not sure what to think about that.)

Lines 6,7, and 8 yield:

List(Chord(G,Maj), Chord(A,Min), Chord(B,Min), Chord(C,Maj), Chord(D,Maj), Chord(E,Min), Chord(F#,Dim))
List(Chord(F,Maj), Chord(G,Min), Chord(A,Min), Chord(Bb,Maj), Chord(C,Maj), Chord(D,Min), Chord(E,Dim))
List(Chord(D,Min), Chord(E,Dim), Chord(F,Maj), Chord(G,Min), Chord(A,Min), Chord(Bb,Maj), Chord(C,Maj))
List(Chord(C,Maj), Chord(D,Min), Chord(E,Min), Chord(F,Maj), Chord(G,Maj), Chord(A,Min), Chord(B,Dim))
-------------------------
List(Chord(A,Min), Chord(B,Dim), Chord(C,Maj), Chord(D,Min), Chord(E,Min), Chord(F,Maj), Chord(G,Maj))
List(Chord(C,Maj), Chord(D,Min), Chord(E,Min), Chord(F,Maj), Chord(G,Maj), Chord(A,Min), Chord(B,Dim))
List(Chord(D,Min), Chord(E,Dim), Chord(F,Maj), Chord(G,Min), Chord(A,Min), Chord(Bb,Maj), Chord(C,Maj))


The fantastic thing about all this is the implementation. I've genericized the stuff I did with Scrabble in a really clean way, and to add search support for an already existing Scale library all I had to do was:


object Scale extends Search[Scale]{
def all = notes.map(MajorScale(_)) ::: notes.map(MinorScale(_))
def rootChord( s: Scale ) = s.rootChord
}

object ScaleMatchers {
def contains(chords: Chord*) = (s: Scale) => {
chords.foldLeft(true){ (b,c) => b && (s contains c) }
}
}

By extending the Search trait, I get all sorts of nice ability to search. All I need to do is provide the definition of the all function. Then, I just define some matchers (in this case only one), and I'm all set to start searching through my data.

Hopefully I'll be able to extract out some of the common matcher stuff as well. I don't think this will be all that difficult.

Its important to note a choice that I've made here. I'm forcing the definition of the all method. Basically, right now, every search loads all the data, and then works on it. Certainly this won't scale, but, this is just a simple little DSL. The underlying implementation could be optimized later. Doug Lea taught me that that is the way you should design languages. First, work on getting the language correct, then worry about performance.

Monday, December 22, 2008

Fun DSL Query Language for Scala

(Quick Note: If you just want to skip to the code, it's interspersed throughout the post. But, for your convenience, I've also put it all in one place at the bottom of the post.)

I've done it! Sick as a dog yesterday, I refactored the crap out of my Scrabble stuff, to create a really powerful query DSL. The kicker? It's only 24 lines of code! It allows you to do all sorts of crazily powerful queries. Lets take a look. I'm so excited! I can't stop typing exclamation points! I actually think this is the best code I've ever written. 1

I'd love to talk about the code smells of my last post, and how I ended up where I did, but to be honest, the code seems so much different now that it would be difficult. I'll just dive in.

Once again I'll start with the test code, which models the original requirements. Don't worry if you don't understand how it works just yet, because I'll explain later. For this first bit, I'm just going to try to explain what it does.

val dictionary = new Dictionary(getWords())
val tray = "defilnr"

// the 4 original requirements
val all = dictionary find includes_all(tray)
val any = dictionary find includes_any(tray)
val only = dictionary find includes_only(tray)
val all_and_only =
dictionary find (includes_all(tray) and includes_only(tray))

print(all, only, all_and_only)

Let me sum up what's going on here.
  • With "all", I've filtered the dictionary to find words that include all the letters in my tray. This produces a small list of very long words - something like: List(adenoliomyofibroma, antifederal, antifederalism, antifederalist, ... )
  • Similarly with "any", I've filtered the dictionary to find words that include any of the letters in my tray. This produces a gigantic list that I won't even begin to list here.
  • With "only", I've filtered the dictionary to find words that include only of the letters in my tray. This produces
  • a very large list - something like: List(d, d, de, dee, deed, deedeed, deer, defend, defender, defer, ... )
  • Finally, the important query, with "all_and_only", filtered the dictionary to find words that include words that include all the letters in my tray, and only the letters in my tray, just as the code reads. This produces something much more helpful to me in Scrabble: List(flinder, infielder)

A key thing to notice is the "and" on this line:

dictionary.filter(includes_all(tray) and includes_only(tray))

But like I said, I'll explain the hows later. Now I'll introduce a couple of new concepts with this line of code:

val all_or_min_10 = dictionary(includes_all(tray) or max_length(7))

The two new concepts on this line are:
  • "or", which is very similar to "and" and very self explanatory.
  • Notice I'm not calling dictionary "find" some_criteria, I'm just calling dictionary(some_criteria). I'll explain how this works later as well. I'm not actually sure I like it, but I figured I'd give it a shot to see how the code looks.

Time for more:

val xin_or_max_4 =
dictionary search (starts_with("xin") or max_length(4))
val van_or_min_4 =
dictionary search (ends_with("van") and min_length(4))

Some more new concepts on these lines are:
  • The "search" method is also an alias for find. But, the actual aliasing is pretty neat, so it will be cool to show how later.
  • Several new criteria functions - starts_with, ends_with, max_length, min_length, which I think are also very self explanatory in what they do.

If I'm moving quickly its because I'm really excited to get into the next part, where I build up a much more complex query.

Complex Queries


val complex_query =
(
(starts_with("d") and ends_with("ing")) or
(starts_with("b") and ends_with("ed"))
) and max_length(7) and min_length(4)

var result = dictionary find complex_query
print(result)

Yes! This seems to have really put it all together. In this query I'm asking for words that start with 'd' and end with 'ing', OR words that start with 'b' and end with 'ed', and any words found must have a maximum length of 7 and a minimum length of 4. I build up the query first, and then ask the dictionary to execute it for me. This seemingly innocent feature plays an important role in abstraction. Which hopefully we'll get to see in just a bit.

Executing this query gives me something like: List(babied, backed, bagged, baked, balled, banded, bangled, banked, barbed, barred, ... dueling, duffing, dumping, during, dusting, dyeing, dying)

Adding onto existing queries


But, I'm not satisfied with my results. Maybe I can't seen to find a place on the Scrabble board where an "a" can be used, and I don't have one in my tray. I want to re-execute my query, but remove all the a's. Here's how:

val complex_query_without_a = complex_query and includes_none("a")
result = dictionary find complex_query_without_a
print(result)

Fancy Schmancy! Executing this query gives me: List(bebed, bebled, bedded, beedged, beetled, beeweed, beinked, beliked, ... dueling, duffing, dumping, during, dusting, dyeing, dying)

Language Features


Before we go into the hows, lets take a very, very quick theory break. Taken straight from SICP, any language gives you three things:
  1. Primitive Expressions
  2. A means of combination
  3. A means of abstraction

Does this little DSL cover those criteria? Yes (though point 3 is somewhat shaky). Anyway, as we cover the hows, we'll refer back to those points to demonstrate how this DSL fulfills them.

Primitive Expressions


We're finally to the implementation (though I suppose many of you probably just scrolled down).

The query criteria methods are basically our primitive expressions in this DSL. 2 Here is the list of primitives provided, and their implementations. I suspect that other primitives will be added soon.

1 object WordMatchers{
2 def includes_only(s: String) = (t: String) =>
inverse(s).foldLeft(true)((b,c) => b && (! (t contains c)))
3 def includes_none(s: String) = (t: String) =>
s.foldLeft(true)((b,c) => b && (! (t contains c)))
4 def includes_all(s: String) = (t: String) =>
s.foldLeft(true)((b,c) => b && (t contains c))
5 def includes_any(s: String) = (t: String) =>
s.foldLeft(false)((b,c) => b || (t contains c))
6 def starts_with(s: String) = (t: String) => t startsWith s
7 def ends_with(s: String) = (t: String) => t endsWith s
8 def max_length(max: Int) = (s: String) => s.trim.length <= max
9 def min_length(min: Int) = (s: String) => s.trim.length >= min
10 def inverse( s: String ) =
"abcdefghijklmnopqrstuvwxyz".filter(!s.contains(_) )
11 }

Ok...I hope no one kills me when I say I'm going to save the how for this one for the next post. That's the implementation. It's pretty crazy, and can probably only be read by pretty experienced developers. I'm really tired and still not feeling great, so I'm going to wait so I don't mess it up.

Means of Combination


The means of combination in this DSL are very simple (at least from a users standpoint). They are simply and and or.

The implementation of those combinations are pretty complex though.

1 object Criteria{
2 implicit def funcToCriteria[T](f: Function1[T, Boolean]) =
new Criteria(f)
3 }
4 class Criteria[T](val f: Function1[T, Boolean]) {
5 def apply(t: T) = f(t)
6 def && (c: Criteria[T]): Criteria[T] = (t: T) => f(t) && c.f(t)
7 def || (c: Criteria[T]): Criteria[T] = (t: T) => f(t) || c.f(t)
8 }

Criteria is some seriously powerful magic, and is totally awesome. I'm also going to wait, and explain both the primitives, and combinations in the next post, which will probably be just tomorrow night. Maybe I'll just update it right here. Maybe I shouldn't even post this yet until I'm finished, but I know there are people that read my blog and this is too cool to wait on!

Execution


Before we take a look at our means of abstraction, lets take a quick look at the Dictionary class, which is something like the DSL's runtime environment, and VM. It holds the words (the environment), and executes the queries which search over those words (VM).

1 case class Dictionary( words: List[String] ) {
2 def apply(criteria: Criteria[String]) = words.filter(criteria(_))
3 def find = apply _; def filter = apply _; def search = apply _;
4 }

Its fairly simple, simply asking the Criteria to evaluate themselves, for each word in the dictionary. If the word doesn't fit the criteria, it is filtered out.

One thing that I really wanted to get to (which doesn't have a whole lot of importance, but is really cool) is the method aliasing. Check out line 3. I've aliased find, filter, and search all to apply, so that a user can call any of them. Maybe that's confusing and not needed, but it was fun.

Means of Abstraction


This DSL borrows its means of abstraction from the Scala language itself, but so far does a pretty poor job of it.

Let's revisit the very last example:

val complex_query_without_a = complex_query and includes_none("a")
result = dictionary find complex_query_without_a
print(result)

I intentionally left out the declaration of complex_query. Do you remember what it does? I suspect not. I could tell you that its a query "asking for words that start with 'd' and end with 'ing', OR words that start with 'b' and end with 'ed', and any words found must have a maximum length of 7 and a minimum length of 4. But, I shouldn't have to. Somehow the code should provide that information upon reading it. There are ways we could do this, but they aren't great, in my opinion.

We could say:

val starts_with_d_and_ends_with_ing_OR_starts_with_b_and_on_and_on_and_on =
(
(starts_with("d") and ends_with("ing")) or
(starts_with("b") and ends_with("ed"))
) and max_length(7) and min_length(4)

But, that really doesn't work. I mean, not only does the variable name go right off the stinkin' page, but the entire implementation is encoded into it. I think that its possible that this DSL fails at doing a great job of abstraction because the underlying nature of the thing its trying to represent - queries matching specific criteria - doesn't really lend itself to abstraction very well. If anyone has any ideas on how this could be improved, I'd be grateful.

OK....that's all for now. I know this thing sort of deteriorated at the end as I got tired, but I'll fix it up really soon. Hope you love it!

Implementation


As promised, here's all the code in one place:

package com.joshcough.scrabble

import Criteria._
import Dictionary._
import WordMatchers._
import org.testng.annotations.Test
import scala.io.Source

class ScrabbleTest {

val lines = Source.fromFile("/usr/share/dict/words").getLines
val words = lines.toList.map(_.trim.toLowerCase)
val dictionary = Dictionary(words)

@Test def tests {

val tray = "defilnr"

// the 4 original requirements
val all = dictionary find includes_all(tray)
val any = dictionary find includes_any(tray)
val only = dictionary find includes_only(tray)
val all_and_only = dictionary find
(includes_all(tray) and includes_only(tray))
print(all, only, all_and_only)



// additional cool things
// (use of Dictionary.apply and Criteria's "||" method )
val all_max_7 =
dictionary(includes_all(tray) and max_length(7))
val all_or_min_10 =
dictionary(includes_all(tray) or max_length(7))
print(all_max_7, all_or_min_10)



// more things (use of Dictionary.filter,
// starts_with, ends_with, max_length, min_length)
val xin_or_max_4 =
dictionary search (starts_with("xin") or max_length(4))
val van_or_min_4 =
dictionary search (ends_with("van") and min_length(4))
print(xin_or_max_4, max_4_or_van)



// more complex query
val complexQuery =
(
(starts_with("d") and ends_with("ing")) or
(starts_with("b") and ends_with("ed"))
) and max_length(7) and min_length(4)

var result = dictionary find complexQuery
print(result)



// hmmm, but i didnt like my result...
// how about i modify my query...
val complexQueryWithoutAs = complexQuery and includes_none("a")
result = dictionary find complexQueryWithoutAs
print(result)
}

def print[T](lists: List[T]*) = lists foreach println
}

/**
* @author dood
* Date: Dec 21, 2008
* Time: 10:09:24 PM
*/
object Criteria{
implicit def funcToCriteria[T](f: Function1[T, Boolean]) =
new Criteria(f)
}

class Criteria[T](val f: Function1[T, Boolean]) {
def apply(t: T) = f(t)
def and (c: Criteria[T]): Criteria[T] = (t: T) => f(t) && c.f(t)
def or (c: Criteria[T]): Criteria[T] = (t: T) => f(t) || c.f(t)
}

/**
* @author dood
* Date: Dec 21, 2008
* Time: 11:24:22 AM
*/
case class Dictionary( words: List[String] ) {
def apply(criteria: Criteria[String]) = words.filter(criteria(_))
def find = apply _; def filter = apply _; def search = apply _;
}

/**
* @author dood
* Date: Dec 22, 2008
* Time: 3:12:35 PM
*/
object WordMatchers{
def includes_only(s: String) = (t: String) =>
inverse(s).foldLeft(true)((b,c) => b && (! (t contains c)))
def includes_none(s: String) = (t: String) =>
s.foldLeft(true)((b,c) => b && (! (t contains c)))
def includes_all(s: String) = (t: String) =>
s.foldLeft(true)((b,c) => b && (t contains c))
def includes_any(s: String) = (t: String) =>
s.foldLeft(false)((b,c) => b || (t contains c))
def starts_with(s: String) = (t: String) => t startsWith s
def ends_with(s: String) = (t: String) => t endsWith s
def max_length(max: Int) = (s: String) => s.trim.length <= max
def min_length(min: Int) = (s: String) => s.trim.length >= min
def inverse( s: String ) =
"abcdefghijklmnopqrstuvwxyz".filter(! s.contains(_) )
}


(1) - I hate the NYSE for keeping me in an intellectual prison for so long, and hate myself for not leaving sooner.

(2) - However, this is an internal DSL, and so you are free to build up more and more primitives, which makes them seem not really like primitives at all...

Sunday, December 21, 2008

Using Scala First Class Functions By Name

WARNING: Today I changed this code all around, and I'll be reposting it soon. This post is still cool, and nothing is wrong with it, but, now the code is wicked awesome!!! So, look for it to be refreshed soon!

It occurred to me that while I knew really well how to pass anonymous functions around in Scala, I had very little experience passing around functions by name. I'm not sure why this is, but I think there isn't all that much out there written about it. It's important to understand that anywhere you could use an anonymous function, you could also use an existing function. Passing functions by name really helps clarify the intent of your code, and for DSL's in general. I'm going to try to demonstrate how to do so, for those who might be in the same boat as me.

I wrote a little app that searches through /usr/dict/words to find words matching the letters that I have in my Scrabble tray. Yes, I could have easily done this with "cat /usr/dict/words | grep someRegEx", but, the point was to learn more about functions.

(quick note...this post is really really long, but i think its full of good information)

First, I needed to be able to determine the following information about a word:
  • Does the word contain ALL the letters in my tray? (but maybe some other letters)
  • Does the word contain ONLY the letters in my tray? (no other letters allowed)
  • Does the word contain ANY of the letters in my tray?
  • Does the word contain NONE of the letters in my tray?
  • Lastly, does the word contain ALL of the letters in my tray, and ONLY the letters in my tray?

To do this I added a function to String that takes the characters to search for, and then any number of additional criteria. The criteria are functions that I reference by name. Lets see the client code, then I'll explain in more detail. Bear with me, I know its frustrating not seeing the actual library code, but I do think it helps to understand how its used before reading it.

1 class StringSearchTest {
2
3 import org.testng.annotations.Test
4 import PimpedString._
5 import Equalizer._
6
7 @Test
8 def testSearch {
9 var word = "ketoid"
10 word.searchFor("ke", word.includes_all) mustBe true
11 word.searchFor("ke", word.includes_none) mustBe false
12 word.searchFor("xzy", word.includes_none) mustBe true
13 word.searchFor("ke", word.includes_only) mustBe false
14
15 word = "keyyek"
16 word.searchFor("key", word.includes_only) mustBe true
17 word.searchFor("xyz", word.includes_any) mustBe true
18 word.searchFor("abc", word.includes_any) mustBe false
19 word.searchFor("key", word.includes_only, word.includes_all) mustBe true
20 word.searchFor("key", word.includes_all, word.includes_none) mustBe false
21 }
22}

Let's break down line a few lines:

10 word.searchFor("ke", word.includes_all) mustBe true
  • "word" is the word we are searching.
  • "ke" is what we are searching for.
  • "word.includes_all" is the criteria. This particular criteria states that 'word' must include all the letters we are searching for. In this case, word must contain both k and e.
  • The statement itself (word.searchFor("ke", word.includes_all)) returns a Boolean. True if the word matches the search criteria, false otherwise.

11 word.searchFor("ke", word.includes_none) mustBe false
12 word.searchFor("xzy", word.includes_none) mustBe true

includes_none is very much the opposite of includes all. It ensures that the word contains NONE of the things we are looking for. The statement in line 11 evaluates to false because the word does contain k (and e for that matter). Line 12 evaluates to true because they word does not contain x, y, or z.

16 word.searchFor("key", word.includes_only) mustBe true

This is nearly the same as the line 10, but the criteria is more restrictive. includes_only states that the word must include only letters that we are searching for. If we were searching for "key" in the word "hey", this would fail, because our word contains an "h".

17 word.searchFor("xyz", word.includes_any) mustBe true

Also very similar to above, but includes_any evaluates to true if the word contains ANY of the things we are searching for.

20 word.searchFor("key", word.includes_only, word.includes_all) mustBe true

Finally the moment of truth, line 20 is the most important of all. It states that the word must contain ONLY the letters in my tray, and ALL of the letters in my tray. This is the easiest way to find Bingo words. (It doesn't yet work perfectly, if my tray contained "key" and the word was "keyyyy", it would still evaluate to true. I need to fix this.)

Finally, before we see the actual library code, lets take a second to understand what "word.includes_all" actually is. "word.includes_all" is a higher order function that takes an Array of Char, and returns a function that takes a Char and returns a Boolean. A higher order function is a function that takes a function as an argument, returns a function, or both. includes_all does both. Higher order functions have been explained by everyone and their brother since 1960, so I won't be doing that here. I'm just explaining how to pass functions by name.

Ok, lets take a look at the library code and I'll explain in more detail.

1 object PimpedString {
2 implicit def pimpMyString(s: String): PimpedString =
3 new PimpedString(s)
4 implicit def convertToCharArray(s: String): Array[Char] =
5 s.toCharArray
6 }
7
8 class PimpedString(s: String) {
9
10 def includes_all(chars: Array[Char]) =
11 fold(chars)(s contains _)
12 def includes_none(chars: Array[Char]) =
13 fold(chars)(!s.contains(_))
14 def includes_any(chars: Array[Char]) =
15 chars.foldLeft(false)((b,c) => b || (s contains c))
16 def includes_only(chars: Array[Char]) =
17 includes_none(inverse(chars))
18
19 private val letters = "abcdefghijklmnopqrstuvwxyz".toCharArray
20 private def fold(chars: Array[Char])(f: Char => Boolean) =
21 chars.foldLeft(true)(_ && f(_))
22 private def inverse( chars: Array[Char] ) =
23 letters.filter(! chars.contains(_))
24 def searchFor(chars: Array[Char],
25 criteria: Function1[Array[Char], Boolean]*): Boolean = {
26 criteria.foldLeft(true){(b, f) => b && f(chars)}
27 }
28 }

(quick note: I've written about implicits here, and here, and I've written about foldLeft here, and I won't be covering them in any detail here.)

Let's just skip right ahead to line 10:

10 def includes_all(chars: Array[Char]) = fold(chars)(s contains _)

As advertised, includes_all is a higher order function that takes an Array of Char, and returns a function that takes a Char and returns a Boolean. Let's take a look back at the original code to see how it was used:

10 word.searchFor("ke", word.includes_all) mustBe true

Now, I could have just used an anonymous function here for my criteria. Assuming that the fold method wasn't private, it would look like this:

10 word.searchFor("ke", word.fold(chars)(s contains _)) mustBe true

The call to fold is basically saying look at each of the search characters and make sure I contain it. fold also returns a function that is executed in search. I could do this, but since includes_all is so common, so reusable, that instead of passing it anonymously, I give it a name, and reuse it all over. And, giving it a name makes it much, much more clear as to what is going on. The second version of line 10 is almost indecipherable.

That basically sums up how to reference functions by name, and pass them around. I'm going to finish the post, showing the actual Scrabble code. However, I don't think there are any ideas that I haven't already covered. Let's start by reviewing the original requirements, since its been a while:

Find all possible words where:
  • The word contains ALL the letters in my tray. (but maybe some other letters)
  • The word contains ONLY the letters in my tray. (no other letters allowed)
  • The word contain ANY of the letters in my tray. (other letters allowed)
  • The word contains ALL of the letters in my tray, and ONLY the letters in my tray.

Now to be consistent, lets look at the test/client code, because that's where you should always start.

1 object ScrabbleTest extends Application{
2 val tray = System.getProperty("tray").toString
3 val scrabble = Scrabble(tray)
4
5 println(scrabble.find(scrabble.all_letters_in_tray))
6 println(scrabble.find(scrabble.only_letters_in_tray))
7 println(scrabble.find(scrabble.any_letters_in_tray))
8 println(scrabble.find(
9 scrabble.only_letters_in_tray,
10 scrabble.all_letters_in_tray))
11 }

I think I have to take some pride in this. The code reads very, very close to the actual requirements. So much that I'm not going to explain it, except to point out again that scrabble.all_letters_in_tray is another example of passing a function by name.

Let's just go right to the library code.

1 case class Scrabble(tray: Array[Char]) {
2 val lines = Source.fromFile("/usr/share/dict/words").getLines
3 val words = lines.toList.map( _.toLowerCase )
4
5 def only_letters_in_tray(s: String): Function1[Array[Char], Boolean] =
6 s.includes_only
7 def all_letters_in_tray(s: String): Function1[Array[Char], Boolean] =
8 s.includes_all
9 def any_letters_in_tray(s: String): Function1[Array[Char], Boolean] =
10 s.includes_any
11
12 def find(criteria: Function1[String, Function1[Array[Char], Boolean]]*) = {
13 words.filter( word => {
14 word.searchFor(tray, criteria.map( _(word) ):_* )
15 }).map(_.trim)
16 }
17 }

Scrabble defines its own criteria functions - only_letters_in_tray, all_letters_in_tray, and any_letters_in_tray. But they are essentially just wrappers around the String criteria functions. This is a testament to how nice and reusable those original functions were, and why I luck out and don't have to explain them :)

The find here simply takes N criteria, and passes wrapped functions onto the String searchFor method. Line 14 is pretty insane, especially criteria.map( _(word) ):_*. It certainly takes some experience before being able to read (or write that line). Fortunately, as a user, the code is absurdly simple, so it doesn't really matter.

That's all for now. I hope this helped.

Anyone want to play me in Scrabble?

Saturday, December 20, 2008

Ruby over Scala

As promised, here is where I'll be maintaining the things I like about Ruby better than Scala. In general, I do think I prefer programming in Scala, but certainly some of the things I can do in Ruby are much easier. If you're looking the other post, its here: Scala over Ruby.

This list is my no means complete and as with the other list, I'll be appending to it here as I learn more.

Pimp My Library


I'll dive into some general theory about this first, but for those who just want to see the code, just scroll down.

There's been so many times when I've needed to add methods to classes, for good reasons. Well, I suppose I didn't "need" to, but the main good reason of course is readability. Having your data and operations on that data in one place is nice. Calling those operations consistently is very important.

Let's use an example in Scala. I couldn't find a "compact" method on List. It like seems it should be there. It's certainly useful. I could just create a method called compact, which takes a List, like so:

def compact[T]( list: List[T] ) = l.filter( _ != null )

But as I just said, calling operations on your data consistently is important for readability. Having this method forces me to call my operations differently. It's confusing calling some methods as infix, and some as prefix. Sometimes I say myList.whatever, and other times I say whatever(myList). The situation is even worse if that method takes arguments. myList.whatever(x), or whatever(myList, x). Inconsistent, confusing, ugly.

One solution is to simply add that method to the class you're operating on. Scala and Ruby both provide mechanisms for doing so, but they differ greatly. In Scala, I have to provide an implicit conversion from the class I'm operating on to a class that has a the method that I want to call. Here is the code:

class Compactable[T](l: List[T]){
def compact = l.filter( _ != null )
}

implicit def compactableList[T]( l: List[T] ) = new Compactable(l)

val list = List( 1, null, 2, null, 3, null )

println(list) // prints all values including the nulls
println(l.compact) // prints just the values 1, 2, and 3

This isn't terrible, but it really clouds my actual intent: I just want to add a compact method to List. Why should I have to convert it to something else, only to have the compiler figure out what I really wanted? While there are some other great uses for implicit conversions, in this particular case something that should be very simple has been made overly complex. Maybe there is a better way of doing this in Scala but I haven't seen it. In addition, I've seen this pattern so many times in Scala code written by other people, so I don't think there is a different way. This code is really just boilerplate.

In Ruby things are much more straight forward.

module Enumerable
def collect_with_index
idx = -1
collect{|elm| idx = idx + 1; yield(idx, elm)}
end
end

(Note: I've used a different method because the compact method already exists. I couldn't find a collect_with_index method.)

Here, I essentially just open up Enumerable and add the method to it. Done. The intent is preserved. Everything is clear, and simple. So simple in fact, that I don't even think I need to explain it.

I could imagine something very similar existing in Scala.

extend trait List[T]{
def compact = filter( _ != null )
}

This doesn't violate encapsulation because I'm not accessing any internals of List that aren't available to me otherwise. This could just be syntactic sugar for the Scala code above. Maybe someday.

Also, I think C# has similar mechanisms, but I'm admitting my inexperience. I'd love to see replies on how this is done in other languages.

Classes are Objects


In Scala, and Java, classes aren't really objects, at least the way they should be. Ruby (and I'm sure Smalltalk, and other languages) gets this right.

I had a discussion with my friend the other day and he asked, "what do you mean? I can call getClass on an object, and call methods on that class...". Eh, you can, but its still a weird fabrication. To a Rubyist, its laughable. So, without further ado, here's an example for Java programmers (in Java because Scala doesn't have static methods)

Let's say I have a simple little interface for a factory that creates Connection objects, and an implementation.

interface ConnectionFactory{
Connection createConnection( String user, String pass );
}

class ConnectionFactoryImpl implements ConnectionFactory{
Connection createConnection( String user, String pass ){
return new WhateverConnection( user, pass );
}
}

Then I have a client that uses a ConnectionFactory:

class ConnectionFactoryClient{

ConnectionFactory factory;

ConnectionFactoryClient( ConnectionFactory factory ){
this.factory = factory;
}

void doIt(){
Connection c = factory.createConnection( "dood", "secret" );
c.connect();
}
}

Now, imagine I had a class that had createConnection as a static method like so:

class ConnectionFactoryClass{
static Connection createConnection( String user, String pass ){
return new OtherConnection( user, pass )
}
}

You can see that the class itself seems like an object that implements the ConnectionFactory interface. If classes were really first class objects, and the class objects themselves could implement interfaces, then I could pass this class into the client like so:

new ConnectionFactoryClient( ConnectionFactoryClass )

You can do this easily in Ruby. This is because classes in Ruby are objects that respond to messages. They aren't really much different than the instances they create, just that the instances respond to a different set of messages the class.

I'm not advocating static methods here. This is something different entirely. Because the class is an object, and can be swapped out for a different object, you're not really statically bound to using that class the same way you would be in Java.

If classes could implement interfaces, you wouldn't have to go through the pain of creating a Factory instance, and passing that in, like this:

ConnectionFactory cf = new ConnectionFactoryImpl()
new ConnectionFactoryClient( cf )

Why go through the pain of creating a new class, and a new instance of a class, when you could use the class object itself. For the most part, classes are factories anyway.

Consider the following common pattern:

class MySQLConnection{
static public void createConnection( String user, String pass ){
new MySQLConnection( user, pass );
}

private MySQLConnection( String user, String pass ){ ... }
}

All the create logic is in one place, and the class is the factory. This, I think, is a good thing.

How could this be done you ask? To make this work with static typing, the language would have to have some construct that allows class objects to implement interfaces. It could look something like so (purpose any syntax you wish):

class MySQLConnection
(class implements ConnectionFactory)
(instances implement Connection){

static public void createConnection( String user, String pass ){
new MySQLConnection( user, pass );
}

private MySQLConnection( String user, String pass ){ ... }
}

We could probably take this even further to allow the constructors to serve as the implementations of the interface methods (maybe by giving constructors aliases).

One final note, in Scala, this would eliminate the need for the top level object that gets created when you define a case class. The class itself could be the object, and it would contain the apply method.

Anyway, that got long. I need to do something to structure this list a bit better.

Wednesday, December 17, 2008

Scala over Ruby

In this post, I'll be maintaining a list of things I like more about Scala, over Ruby. I am also maintaining a similar, opposite list - Ruby over Scala. I plan on maintaining this list here and adding more to it over time, as opposed to several posts. I'm not exactly sure how that works with feeds and things, but oh well.

Also, if at any point I'm totally wrong (which is very possible because I'm new to Ruby), please let me know! It's likely that I just didn't know I could do something in Ruby. I'm eager to learn, and for new ideas, and I'll be more than happy to update the post. Anyway, Onward...


1. Scala doesn't make you use the "." operator when calling methods.

This one I'll try to explain using some simple examples. Here is some Ruby code:

x = 5
y = 6
z = x + y
z2 = x.+y

class ClassWithPlusMethod
def +(other)
# do something, doesnt matter what...
return "whatever"
end
end

x = ClassWithPlusMethod.new
y = ClassWithPlusMethod.new
z = x + y
z2 = x.+y

class SomeOtherClass
def plus(other)
# do something, doesnt matter what...
return "whatever"
end
end

x = SomeOtherClass.new
y = SomeOtherClass.new
z = x plus y

(irb):25: warning: parenthesize argument(s) for future version
NoMethodError: undefined method `plus' for main:Object
from (irb):25

z2 = x.plus y # this is ok!

As you can see, Ruby has magic handling of + mathematical "operator" (and others). "+" is a simple method call, but since it's name is +, Ruby doesn't require the dot, its interpreter handles it differently. This magic is built into the language, and does not extend to methods with regular names, as shown above with "plus".

In Scala, the . operator is not mandatory on any method call. This may seem trivial, but it is not. It helps in creating cleaner DSL's. Take for example, my post on Equalizer.

In Scala, I was able to add the mustBe method to Any, and call it very nicely. But, in Ruby, I had to put in the dot. Contrast these examples:

x mustBe 5

vs

x.must_be 5


2. Method overloading is good.



I use method overloading a lot. An awful lot. I love it. I don't think I need to explain it however.

3. Scala's apply method.



Scala's apply method is something that I'm pretty sure just doesn't exist at all in Ruby, and I'm not sure it can be done easily. For those who don't know what it is, I'll explain, and maybe some Rubyists can show some similar examples. Given that I'm still not a Ruby expert, I could be totally wrong. If so, I would like to know.

For any method called apply, ".apply" can be omitted from the method call. For example given the following definition of the class Array:


class Array{
def get(index:Int) = { ...some code to get from the array... }
def apply(index:Int) = get(index)
}

And an instance of that class:

val a = new Array(whatever)

Then the following calls are essentially equivalent:

a.get(7) // though only because apply calls get
a.apply(7)
a(7)


This is really useful in cleaning up syntax. However, the real beauty of it is hidden below the surface, and will be the subject of a future post.

4. Ruby faking keyword parameters to methods



Calling methods with keyword params is nice for client code readability. For example:

c = httpConnection( :host => "google.com", :port => 80 )

However, in Ruby the niceness ends at the client code. The library code to support this is abysmal. The reason being, for the method using those parameters, its not at all obvious what the method requires. You have to dig down into the implementation to find out what it actually pulls out of params. This is not ok with me. When I look at a method signature, I should be able to understand what dependencies the method has. Example:

def httpConnection(params)
...
...
port = params[:port]
...
...
host = params[:host]
end


5. 1 vs. 3 line method defs.



I run into this one a lot, and find that Ruby code ends up being a lot larger than Scala code for this reason. In Ruby, I can't define a method on one line without it looking simply terrible. Example:

def add7(x)
x + 7
end

or

def add7(x); x + 7; end

The first version is 3 lines when it doesn't need to be. The second version is one line littered with noise.

Here's the much simpler Scala version:

def add7(x:Int) = x + 7

While this entire point might sound trivial, its not. Functional style encourages us to write many small methods. Ruby code grows larger than Scala code quickly. The problem also becomes bad when using nested functions. Ruby:

def add7(x)
def add3
x + 3
end
add3 + 4
end

Scala:

def add7(x:Int) = {
def add3 = x + 3
add3 + 4
}


All of this might seem rather trivial, but in practice it definitely adds up.

Tuesday, December 16, 2008

Coming Soon

For people that care (probably just myself, but I hold out hope), I'm going to spew out a list of the things I'm working on, posts that can be expected in the near future, and other random crap.

  1. The project I was working on was indeed cancelled, as predicted. In hindsight, I'm coming out looking pretty smart. I expect to write some more of my true feelings on this situation. One sad thing of course is that my good friends and old coworkers may soon have to look for new jobs. I hope not.
  2. Ruby is nice. I won't say that it's not as nice as Scala, as I may be attacked by the very vocal Ruby community. But I will say that I enjoy writing Scala much more (I do plan to expand on why, but this isn't the right time). However, some of the dynamic things you can do with Ruby are fascinating.
  3. Rails is nice as well, though I think it has its flaws. I hope to learn Lift very soon and do a nice comparison. We'll see if I have enough time.
  4. Today I started reading about building Facebook apps in Rails. This is something I'll likely dedicate a lot of time to.
  5. Yesterday I learned a bunch about iPhone development. I haven't written anything yet, just had a big walk-through of a pretty popular application. At first glance it seemed fantastic as Apple has made creating views quite simple. This is great because I'm terrible at any visual stuff. Anyway, this is the thing that I will probably end up dedicating the most time to; mostly because I still hold out hope for entrepreneurship.
  6. I am loving my new job. Leaving NYSE has turned out to be a fantastic decision. My new team is great. I'm re-energized. I'm challenged. I'm actually learning at work, instead of just at home. Learning is encouraged, not shunned. I love it.
  7. Lastly, I told myself about a year ago that I wouldn't be getting back into web development. This has changed for a few reasons:

    1. I will still never ever ever try to make things look nice.
    2. It seems like the frameworks available today eliminate most of the rest of the crap that I didn't want to do before.
    3. Having immediate access to millions of users on Facebook seems really exciting, eliminating more crap that I was never any good at - getting people to come to my site.
    4. Most importantly, I have an opportunity to learn new languages, and obviously that's what I really care about