MongoDB SF Conference Notes

CRUD in the JS shell
http://github.com/mdirolf/shell_presentation

Schema Design
4MB object limit
atomicity at the document level
... the rest of the talk isn't visible due to his slides/projector :(
the $ operator sounds cool, I need to research this
compare & swap is the safer and more appropriate pattern than just modifying a single value
Sharding should be considered when designing the schema
Capped Collection - a rrd(?) style. fixed size, will delete oldest records when the size limit is reached
    automatically stores insertion time and allows for queries based on that value
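The compare & swap note can be sketched in plain Ruby. This is an in-memory stand-in rather than the MongoDB driver: FakeCollection and its methods are hypothetical names made up for illustration. The idea is that an update only goes through if the field still holds the value you read earlier.

```ruby
# In-memory stand-in for a collection. FakeCollection is a hypothetical
# name for illustration and is not part of any MongoDB driver.
class FakeCollection
  def initialize
    @docs = {}
    @lock = Mutex.new
  end

  def insert(id, doc)
    @lock.synchronize { @docs[id] = doc.dup }
  end

  def find(id)
    @lock.synchronize { @docs[id] && @docs[id].dup }
  end

  # Compare & swap: apply the change only if the field still holds the
  # value the caller read earlier. Returns true when the swap happened.
  def compare_and_swap(id, field, expected, new_value)
    @lock.synchronize do
      doc = @docs[id]
      return false if doc.nil? || doc[field] != expected
      doc[field] = new_value
      true
    end
  end
end

accounts = FakeCollection.new
accounts.insert(1, "balance" => 100)

seen   = accounts.find(1)["balance"]
first  = accounts.compare_and_swap(1, "balance", seen, seen - 30)
second = accounts.compare_and_swap(1, "balance", seen, seen - 50) # stale read
```

The second call works from a stale read, so it is rejected and the caller has to re-read and retry instead of silently overwriting the first change.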

From MySQL to MongoDB
mongo loves system resources
    run on its own machine to keep from paging
    takes significantly more disk space than MySQL
    disk speed is your bottleneck
mongo is faster than hibernate in java (woohoo?)
reduce disk usage by using shorter key names (veryLongAttributeName => vlan)
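Key names are stored in every document, which is why shortening them saves space. Here is a minimal sketch of keeping readable names in code and short names in storage. KEY_MAP and both helpers are hypothetical, and JSON length is only a rough proxy for what BSON takes on disk.

```ruby
require "json"

# Hypothetical mapping from readable attribute names to short stored keys.
KEY_MAP = {
  "veryLongAttributeName"  => "vlan",
  "anotherDescriptiveName" => "adn"
}.freeze

# Rename keys on the way into storage; unknown keys pass through unchanged.
def shorten_keys(doc)
  doc.each_with_object({}) { |(key, value), out| out[KEY_MAP.fetch(key, key)] = value }
end

# Reverse the mapping on the way back out.
def expand_keys(doc)
  inverse = KEY_MAP.invert
  doc.each_with_object({}) { |(key, value), out| out[inverse.fetch(key, key)] = value }
end

long_doc  = { "veryLongAttributeName" => 1, "anotherDescriptiveName" => "x" }
short_doc = shorten_keys(long_doc)

long_bytes  = long_doc.to_json.bytesize
short_bytes = short_doc.to_json.bytesize
```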

Mongomapper
custom types => DowncasedString (to_mongo/from_mongo)
gridfs + jnunemaker's joint plugin to store files
identity map plugin may help reduce queries but requires a rethink of how to use mongomapper
prophesying
    activemodel (when rails 3 is complete)
        validations, callbacks, dirty tracking, serialization, etc.
    blank document
        mongomapper w/o all the plugins (ie: more customization)
    mongo::query
        similar to ARel
Class in Michigan: ideafoundry.info/mongodb
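To make the custom types note concrete, here is a rough sketch of what a DowncasedString type might look like. It assumes the mapper calls a pair of class methods, to_mongo when writing and from_mongo when reading. That is my understanding of the convention from the talk, so check it against the MongoMapper version you use. The class is plain Ruby and does not need MongoMapper loaded.

```ruby
# Custom type in the to_mongo/from_mongo style mentioned in the talk.
# Assumption: the mapper calls these class methods on write and on read.
class DowncasedString
  # Called on the way into the database.
  def self.to_mongo(value)
    value.nil? ? nil : value.to_s.downcase
  end

  # Called on the way back out of the database.
  def self.from_mongo(value)
    value.nil? ? nil : value.to_s.downcase
  end
end

stored = DowncasedString.to_mongo("Andrew@Example.COM")
loaded = DowncasedString.from_mongo(stored)
```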

Event Logging
map/reduce
counting in real time! (?)
use ruby to generate JS map/reduce code
mongo is really fast for map/reduce
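Generating the map/reduce JavaScript from Ruby can be as simple as building strings. A minimal sketch follows; the helper names and the event_type field are made up for illustration, and actually submitting the functions to the server is left out.

```ruby
# Builds JavaScript map and reduce function source as strings.
# The field name is interpolated, so only plain identifiers are allowed.
def count_by_map(field)
  name = field.to_s
  raise ArgumentError, "bad field name" unless name.match?(/\A[A-Za-z_]\w*\z/)
  "function() { emit(this.#{name}, 1); }"
end

# Sums the emitted counts for each key.
def sum_reduce
  <<~JS.strip
    function(key, values) {
      var total = 0;
      for (var i = 0; i < values.length; i++) { total += values[i]; }
      return total;
    }
  JS
end

map_js    = count_by_map("event_type")
reduce_js = sum_reduce
```

Validating the field name before interpolating it keeps arbitrary JavaScript from sneaking into the generated function.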

Administration
log rotation commands built in

First Annual Software Craftsmanship North America Conference

I’m at SCNA today, and so far it’s pretty great. I know this post is short but I plan on eventually cleaning up my notes and posting them here later today or tomorrow. I promise!

*edit* As promised, here are my notes (maybe, just maybe I’ll write up a summary… later).

George Leonard - Mastery (Book)

Ken Auer - Swimming Upstream, ...
  Does a fish know he's wet?
  Google: "Sutherland Sketchpad"
          "intrapreneurs"
  Home schoolers vs Classroom schoolers
    apprenticeships
  Staying home everyday leaves you less challenged
    Leaving home everyday leaves your family less challenged
  rolemodelstudios.com <= family project/business
  integrated life - large home for family, extended family, and work
  Pragmatic Programmer
    Learn to Program (Yellow Belt)
    Agile Web Dev w/Rails (Black Belt)

Michael Feathers - Self Education and the Craftsman
  Big O / Little O / Theta Notation
  Covariance & Contra-variance - substitutability
  Types - not just language constructs
  State Machines - the forgotten diagram
  Turing Machines
  The Halting Problem - limits on verifiability (?)
  Worse is Better - simplified design is easier to debug (but has less features)
  Redundancy is not Strength - reinforce a bridge
    the same specs to different teams produces a correlation in the bugs from each team
    "things only fall apart when we touch the code"
  Security on Sand - On Trusting Trust (paper)
  Location Transparency is a Myth
  Books - "Cool stuff to know"
    SICP - Structure and Interpretation of Computer Programs (http://mitpress.mit.edu)
    Syntropy - Designing Object Systems - Steve Cook & John Daniels
    Graph Theory - Graphs: Theory and Algorithms
    Compilers, Principles, Techniques and Tools
    Discrete Mathematics with Combinatorics - James A. Anderson
    Theory of Computation - Michael Sipser

Christopher Avery - Demonstrating Responsibility: The Mindset of -an Agile Leader- Craftsmanship
  Involved w/Agile ~ 2004 - Accidental Expert
  How You Respond to a Problem
    Who did this? Who's responsible?
    That doesn't happen when things go well
  How many times a day do things go wrong? (people not attending meetings, or replying to emails)
  Problem => Denial => Lay Blame => Justify => Shame => Obligation => Responsibility
  Lay Blame - pointing fingers (humans do it and are good at it, coping mechanism)
    "You can get stuck there, or you can get off of it" - C. Avery
    Cause / Effect scenario
  Obligation - transient mindset, have to do it, don't want to, but we have no choice
  Responsibility only happens when you refuse to accept obligation
  Quit (transient mental state) can come between shame / obligation
    you've checked out because you refuse to accept responsibility
  Highly engaged customers predict revenue/stock/etc. increases
    only thru highly engaged employees is this possible
  Intention - the winning key
  Awareness - the change key
  Confront - the truth key

Jim Weirich - Grand Unified Theory of Software Design
  Way more than 4 forces in the software universe
    SOLID, Law of Demeter, DRY, Small Methods, Design by Contract, etc.
  Sheldon Jordan - RCA Missile Test Project - mentor to J. Weirich
    Composite Structured Design - Book
      Coupling & Cohesion
        Coupling - (less) None, Data, Stamp, Control, External, Common, Content (more)
  Connascence - 2 pieces of software share connascence when a change in one requires a corresponding change in the other
  Connascence of Name
  Connascence of Position
    Generally good to move from CoP to CoN (array vs hash)
  Connascence of Meaning (ex. 1 = true, 2 = false)
    use a constant to convert to CoN
  Connascence of Algorithm
    use DRY to convert to CoN
  Connascence of Timing (Race Condition) - Threading
    mutexes can protect against this problem
  Connascence of Execution - ordering of steps in an algorithm are important
  Connascence of Identity - duplicate objects (2 sql queries for the same object, AR::Base problem, DataMapper solves this with an identity map)
  Connascence of Value
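A couple of the conversions above are easy to show in Ruby. This is my own sketch, not code from the talk: the method names and constants are invented, and it uses keyword arguments where the talk's array vs hash example would have used an options hash.

```ruby
# Connascence of Position: caller and method must agree on argument order.
def create_user_by_position(name, email, admin)
  { name: name, email: email, admin: admin }
end

# Connascence of Name: keyword arguments tie both sides to names only,
# so the caller can pass them in any order.
def create_user_by_name(name:, email:, admin: false)
  { name: name, email: email, admin: admin }
end

# Connascence of Meaning: a bare 1 or 2 means something only by convention.
# Named constants convert that to Connascence of Name.
STATUS_TRUE  = 1
STATUS_FALSE = 2

def active?(status)
  status == STATUS_TRUE
end

by_position = create_user_by_position("Ann", "ann@example.com", true)
by_name     = create_user_by_name(email: "ann@example.com", admin: true, name: "Ann")
```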

Ward Cunningham - What if Bacteria Designed Computers?
  Dorkbot
  Invented his own wire protocol called Bynase
  Arduinos are the future
  Uses random numbers and electronic "noise" to communicate between processors

Dave Hoover / Paul Pagel - Apprenticing to Mastery
  Apprenticeship is the only way to achieve mastery
  Picasso wasn't a child genius, he apprenticed
  Open source projects are a great way to learn
  Great programming books generally don't have the name of a language in their title
  A master must also be a good teacher
    intuition is great, but without articulation it's not helpful

Bobby Norton - Test Driven Learning
  Start small, don't take on too much at once
  apprenticeship patterns, walk the long road
  Novice, Advanced Beginner, Competent, Proficient, Expert
  Experts adapt, create and advance the practice
    work from intuition, not reason
    don't need rules
    github.com/bobbyno/shubox

Bob Martin - Craftsmanship Under Pressure
  Holy shit, thermodynamics and crazy astrophysics
  The universe appears to defy the law of conservation of energy
  Estimation -> Manage Expectations
  QA shouldn't find any bugs
  close to 100% code coverage (I don't agree)
  dogmatic about TDD
  Changing code makes it changeable
    each refactoring strips out hard to change areas with better versions
  boy scout rule - leave it cleaner than you found it
  time between writing code and tests should be very very small (or negative: TDD)
  one week of overtime is ok, but 2 months is not
    you will do harm to your code, become complacent and stop caring
  professionals know how to have an uncomfortable discussion (ex: slipping release date or change in scope)
  "when i feel pressure i slow down"
  professionals have a work ethic
    not coding for their own joy (that should be a secret)
    will work for our employer to deliver value
    40hrs are the property of your employer (or customer)
      employer is not responsible for your education (books, conferences, etc.) if they do it's very nice
  "The core of this craftsmanship movement is developers taking responsibility for their own career"
  You should be able to list ALL the design patterns, it's your profession!
  You should know many languages, only knowing 1 is a problem
  There is a lot of information that has been accumulated over the decades that you should know, it's your profession!
  Read the works of David Parnas and his tables to describe how systems work
  Why can't anyone in this room write quick sort out of their head
  Continuous Learning - our profession has a tendency for the individuals to never stop learning
    we live to learn, and love to learn (at least we should)
  Jugglers and Musicians are over represented in SE's
    both take time to learn and master, but we love to learn
  Step outside your comfort zone
    language and methods included!
    broaden your zone

More notes to come at Octavity from a fellow attendee.

Arduino Nixie Tube Bar Graph Control With TLC5628CN

Check out these tubes, pretty bad ass, don’t you think? With the help of Daniel Naito’s amazing project and the TI TLC5628CN 8-bit serial DAC I’ve managed to get my Arduino controlling an IN-13 tube. I’ve got simple 3-wire serial communication working with the TLC5628 (thanks Ogi Lumen), but I’m only utilizing one DAC right now. The nixie tubes on the left are facing up and therefore are hard to see, but they are cycling 0-9-0.

Steampunk Here I Come

As my previous posts show, I’ve been working with an Arduino and some Nixie Tubes. Don’t get me wrong, they’re pretty cool, but they don’t come close to IN-13 Bar Graphs. Yeah, neon tube bar graphs that glow orange, don’t even try to deny the badass nature of these.

Check out the circuit I put together to test these tubes out. No Arduino required, just a power supply, a couple of resistors, some pots and a capacitor. I’d love to take credit for designing this circuit, but I borrowed it from this super helpful project: Multi-band VU Meter. Specifically, the circuit I built can be found here.

Nixie Tube Bar Graph

Still want more? Check out this video I took with the iSight in my Macbook.

Some Arduino Photos and a Better Film

I gave a brief Lightning Talk today at work. It went well and the audience seemed genuinely interested and engaged. I’ve posted a 30 second youtube video of the device running on my desk and showing off a few of the commands over the serial port.

A few people have mentioned they have no idea what I’m talking about or what’s going on in those awful videos I’ve put on youtube. To help clear things up here are a couple of photos.

Arduino 4
With the flash it is hard to see the lit digit, but it is a 5 on the right. Arduino up top and Ogi Lumen power supply on the right.

Arduino 3
Same as above but without a flash and the five on the left.

Arduino 2
More of the same but with 00 on the tubes.

Arduino 1
You get the point, it displays digits, for example: 99.

Hopefully the pictures make it clear how simple the configuration is. No extra components are required: the Arduino wires directly to the Nixie Tube Driver and the power supply only has two wires. The code is also pretty simple at about 115 lines and heavily commented. I forgot how fun playing with electronics can be when it isn’t for a grade.

Playing with the Arduino

I might have mentioned previously that I recently bought an Arduino Duemilanove. It’s pretty damn cool and I’m amazed at the amount of power you can get for such a small price (and form factor).

My first month or so was spent hooking it up to LEDs and the Ethernet Shield. I wasted a bunch of time on a poorly thought out messaging system. I had LEDs hooked up to PWM outputs and was controlling their brightness independently from my computer. It was a great learning experience and the perfect way to remind me how to write C. Luckily for us all, that code has been scrapped (it can probably be found at the first revision on Github).

This week I got a package from Canada. “What comes from Canada?” you might ask. Not much, but there is a place that loves to ship NOS Russian Nixie tubes. Ogi Lumen has been a pleasure to deal with, and provides a top quality kit. Assembling the Nixie Driver kit took about an hour, and getting it up and running was smooth as can be. Ogi Lumen provides a concise and well written library with only a handful of functions for the developer to learn.

Tomorrow I’m giving a lightning talk at work about all this. I’ve spent the better part of the evening putting together some code to help with a demo. I’ve cleaned up the messaging protocol a bit (it still needs major refactoring, but I’ve opted to keep it simple). Now I have a few different kinds of messages, as well as minor error handling. I’ve set up a simple way to clear the Nixie tubes, a command to left justify some digits (and continuously shift them to the right) as well as right justify the display (with no shifting). But my ace in the hole is a demo command I’ve put together that will do some cool cycling of content across the tubes.

Ok, so it’s still not all that exciting, eventually I’ll buy more than two tubes. First though, I have plans to conquer this beast: Neon Bar Graph. I’ve placed an order from Digi-Key for a few discrete components to get myself started, eventually building a shift register/DAC driver similar to the Ogi Lumen Nixie Tube Driver kits (probably without the cool boards though).

News Groups :(

Oh news groups, we really have a love hate relationship.

Why do many people choose never to send a final correspondence to let the others in the discussion know how things turned out? It’s really frustrating to try to offer help and have no idea if you were useful.

Also, why isn’t Google Groups more like a forum and less like a minor update to an aging system?

Rails Bug Found while Streaming Output in ActionController Tests

Last week I stole a ticket from my boss. It’s an interesting one involving streaming of XML data to another application. There are some issues with the concept and process, but it’s a legacy system and we can’t change it (other teams are busy building a replacement). We’ve also recently upgraded this project from Rails 2.2 to 2.3 and switched from Mongrel to Passenger. All this means I’ve been tasked with refactoring all of this streaming code from a custom Mongrel handler to a normal ActionController setup utilizing a render call with a Proc in the text attribute.

The XML streaming is actually a search API for our ETL system. It queries for very large chunks of data. The streaming requirement has a few positive side effects: it allows the extracts to take place quicker, and it helps us run our queries and instantiate our models in batches, keeping our RAM usage under control. Check out the really simple example below.

def search
    response.content_type = Mime::XML

    render :text => Proc.new{ |resp, out|
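      # Tag names are assumed: the original markup was lost from this post.
      # "element" is inferred from the assert_select in the test further down.
      out.write "<elements>"
      1.upto(10){ |i| out.write "<element>#{i}</element>" }
      out.write "</elements>"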
    }
end

I had the migration from the Mongrel handler to a regular controller mostly done, so I decided it was time to get some tests together. I set up the first test; the simplest thing I could do was ask for the default set. It looked something like this:

test "default" do
    get :search
    assert_response :success
    assert_select "element", 10
end

I hit cmd-r in TextMate and something was wrong. I got the following exception: TypeError: can’t convert Proc into String. After about 45 minutes of debugging I found the source of my problem, line 16 of HTML::Document. After thinking it over, it seemed like the appropriate place to make the update was around line 490 in TestProcess. The update is pretty simple: check to see if @response.body responds to call, and if so run it with a StringIO object, eventually passing its contents to HTML::Document.new. Check out my ticket and patch at Lighthouse. Read it over, check out the patch, run the tests and give it a +1 if you don’t mind.
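The idea behind the fix can be sketched outside of Rails. This is not the actual patch: body_as_string is a hypothetical helper, and the tag names in the example body are placeholders. It only shows the check described above, where a body that responds to call gets run against a StringIO and the collected string is what gets parsed.

```ruby
require "stringio"

# Hypothetical helper, not the actual Rails patch: returns the response
# body as a String whether it is already a String or a streaming Proc.
def body_as_string(body, response = nil)
  return body.to_s unless body.respond_to?(:call)

  output = StringIO.new
  body.call(response, output) # same (response, output) pair the render Proc receives
  output.string
end

# Placeholder tag names, chosen only for this example.
streaming_body = Proc.new do |_resp, out|
  out.write "<elements>"
  1.upto(3) { |i| out.write "<element>#{i}</element>" }
  out.write "</elements>"
end

plain    = body_as_string("<ok/>")
streamed = body_as_string(streaming_body)
```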

Get My Patch Added to Core

Help me out here. I came across what I think is a bug in Rails core. It’s a really small issue, and an even smaller patch. Seriously, it’s only two lines of code, and three new tests — as you can see in the Lighthouse Ticket.

Oh, you want a description? The description I wrote in the ticket wasn’t enough? Fine, I’ll expand a bit here. First, let me state that this issue was only uncovered due to our semi-unorthodox deployment strategy. I say semi-unorthodox because I’ve never read of anyone else using a similar style, but I don’t see an alternative.

Here at work we build apps for two sets of users: the public and our business team. These sets of users see totally different applications; in fact, they aren’t even running on the same boxes. Think of our internal tool as a CMS and Customer Service tool. Instead of writing two applications and making them share models, we have one larger application but can control how it reacts by changing the RAILS_ENV. Our basic configuration contains three environments: development, production and admin (this is a simplification: we also have multiple test environments, but they are unimportant for the sake of this discussion).

To limit what can take place on each environment we have segregated our routes.rb like you see below.

# back end services
if %w(admin development test).include?(RAILS_ENV)
    path_prefix = (RAILS_ENV == "development") ? "admin" : ""
    map.namespace(:admin, :path_prefix => path_prefix) do |admin|
      admin.connect "/applicants/:id", :controller => "applicants", :action => "show"
    end
end

# front end site
if %w(production development test).include?(RAILS_ENV)
    map.root :controller => "applicants", :action => "new"
end

Let’s go over this routing real quick. You’ll immediately notice two blocks, the first being for “back end services” (i.e.: internal business users), and the second for the “front end site” (i.e.: the public). If you look closely you’ll see the only difference between the conditionals for each block is that the first includes the “admin” environment while the second uses the “production” environment.

The issue I ran into stems from line #3. When path_prefix is blank, RouteBuilder generates improper URLs, therefore affecting everything inside the admin namespace.

  • When path_prefix is set, the admin URLs look like: /admin/applicants/1
  • When path_prefix is blank they look like: //applicants/1

That extra slash is a bit of a problem with the way we deploy our “admin” environment. We use some Apache fu to map http://admin/project to the root of the application, meaning our internal users need to go to http://admin/project//applicants/1. This URL is clearly incorrect and should never be generated.
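To show how a blank prefix ends up as a double slash, here is a small sketch. Neither helper is Rails' routing code. naive_join just mimics the symptom, and safe_join is one possible way to build the path without empty segments.

```ruby
# Hypothetical helpers to illustrate the symptom; this is not Rails' routing code.

# Glues the prefix in front of the path with no checks, so a blank
# prefix leaves two slashes next to each other.
def naive_join(prefix, path)
  "/#{prefix}#{path}"
end

# Splits both parts into segments and drops the empty ones before joining.
def safe_join(prefix, path)
  segments = [prefix, path].flat_map { |part| part.to_s.split("/") }
  "/" + segments.reject(&:empty?).join("/")
end

naive_with_prefix = naive_join("admin", "/applicants/1")
naive_blank       = naive_join("", "/applicants/1")
safe_blank        = safe_join("", "/applicants/1")
safe_nil          = safe_join(nil, "/applicants/1")
```

With a blank prefix naive_join returns //applicants/1, while safe_join returns /applicants/1 for both a blank and a nil prefix.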

Without the path_prefix we are severely limited in our ability to run the full project in development mode. With the path_prefix set in “development” mode we can view the public portion of the site at http://localhost:3000/applicants/1 and the private part at http://localhost:3000/admin/applicants/1. It should be noted that these two routes go to two completely different controllers, one being in the global namespace, and the other existing inside an Admin module.

Now I know if you paid attention you’re probably thinking: “Andrew, you can get around this with a simple refactor, maybe try this…”

path_prefix = (RAILS_ENV == "development") ? "admin" : nil

Yes, you’re right, that would work: having the path_prefix set to nil is a simple solution. But, a bug is a bug, and adding an extra slash to a URL isn’t an appropriate thing to do. So please, if you made it this far and you don’t think I’m a raving loon then +1 my patch.

Am I about to become famous?

Nah, probably not, but I do have a little something to brag about. A certain Rick Olson has kindly accepted a second patch of mine. Like the first couple of commits he merged, this new one is pretty small. It’s a simple update to the MasterFilter class in his Masochism plugin.

Masochism is a pretty specialized plugin, but extremely useful, small and easy to work with. In short Masochism provides a simple way to use a master/slave database configuration from a Rails project. It does this by delegating to different ActiveRecord connections based on the type of SQL you are generating, with reads going to the slave and writes to the master.

It’s also nice enough to provide a few extra classes under the ActiveReload namespace. The first two are abstract ActiveRecord::Base classes, MasterDatabase and SlaveDatabase, that force the models inheriting from them to work entirely in the implied database. The third is MasterFilter, a class designed to be used as an around filter in a controller. When this filter is executed, your models do all reads and writes against the master database.
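The read/write split is easier to picture with a toy version. This sketch is not Masochism's code: FakeConnection, ReadWriteProxy and with_master are hypothetical names, and real SQL classification is more involved than one regular expression.

```ruby
# Hypothetical sketch of read/write splitting; not Masochism's actual code.
class FakeConnection
  attr_reader :name, :log

  def initialize(name)
    @name = name
    @log = []
  end

  # Records the statement and reports which connection handled it.
  def execute(sql)
    @log << sql
    name
  end
end

class ReadWriteProxy
  READ_PATTERN = /\A\s*(select|show)\b/i

  def initialize(master, slave)
    @master = master
    @slave = slave
    @force_master = false
  end

  # Reads go to the slave, everything else goes to the master.
  def execute(sql)
    target = (!@force_master && sql =~ READ_PATTERN) ? @slave : @master
    target.execute(sql)
  end

  # Similar in spirit to an around filter that pins all work to the master.
  def with_master
    previous = @force_master
    @force_master = true
    yield
  ensure
    @force_master = previous
  end
end

master = FakeConnection.new(:master)
slave  = FakeConnection.new(:slave)
proxy  = ReadWriteProxy.new(master, slave)

read_target   = proxy.execute("SELECT * FROM users")
write_target  = proxy.execute("UPDATE users SET name = 'x'")
pinned_target = proxy.with_master { proxy.execute("SELECT * FROM users") }
```

Pinning to the master matters right after a write, when the slave may not have caught up yet.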
