Blog

04. June 2014

Fast and slow databases

I’m not a Mongo guy. Some of its features seem neat, but I haven’t spent enough time with it to be able to recommend it as a Good Thing. I have spent a lot of time making fun of its ability to lose data. If I want a document store, I reach for CouchDB.

That said, I think this is less of a Mongo thing and more of a document store thing. The same criticism could easily be leveled at Couch.

Personally my approach to this issue is ‘just deal with the slow reads’. If your database server is getting put under too much pressure because each request is expensive, scale it horizontally. Being able to do that is the whole point.

If your application is super latency sensitive, use Redis for the bits you really need quickly. Better yet, use Redis for everything you can get away with because it’s lovely.

If neither of those is a good option for your application, then you’ll need a different way of storing your data. Maybe Postgres with a bunch of read slaves, or Postgres-XC, or something.

There is definitely no shortage of wacky databases out there.

25. January 2012

The new Unity HUD is a trojan horse.

I am super excited about the new HUD coming to Unity.

http://www.markshuttleworth.com/archives/939

Quicksilver on the Mac is great, and this new HUD looks like a Quicksilver that also works inside application menus. That is flat-out fantastic and it may well pull me away from OS X.

While I'm super excited about it, I think the thing that makes it so powerful is going to be both an opportunity and a danger for its continued existence.

What tools like the HUD and Quicksilver do is give us some of the flexibility and power that we get on the command line. For example, say I want to copy all image files from my desktop into my Pictures folder. I laboriously click them all on the Desktop (remember it's only images I want, so I can't just select everything, and they're probably interspersed with other icons), then open Finder, pick out the folder I want from all the options in there, then drag all the highlighted files in there. During this process maybe I grabbed one or two files I didn't want, or I flubbed my aim with the mouse and they all ended up in Documents or something.

On the command line, it's as simple and elegant as:

cp ~/Desktop/*.jpg ~/Pictures/

To get that simplicity, elegance and power, however, I had to pay a toll that many users are unfortunately unwilling to pay. I had to memorise a few things. I had to remember that cp is the copy tool, that ~ is shorthand for my home directory and that * is a wildcard. That is basic bash, and it has repaid the time spent learning it many, many times over. Even so, many people are intimidated by this barrier.
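For anyone who wants to see those three pieces at work without risking their real files, here is a minimal sketch. The sandbox directory and the file names in it are made up for illustration; the only thing taken from above is the cp-plus-wildcard pattern itself.

```shell
# Hypothetical sandbox, so nothing here touches the real home directory.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/Desktop" "$sandbox/Pictures"
touch "$sandbox/Desktop/holiday.jpg" "$sandbox/Desktop/cat.jpg" "$sandbox/Desktop/notes.txt"

# Preview what the wildcard expands to before copying anything.
echo "$sandbox"/Desktop/*.jpg

# The same copy as above, pointed at the sandbox instead of ~.
cp "$sandbox"/Desktop/*.jpg "$sandbox/Pictures/"

# Only the .jpg files should have arrived; notes.txt stays on the Desktop.
ls "$sandbox/Pictures"
```

The echo line is the nearest thing the shell offers to poking around a menu: the shell expands the wildcard before the command ever runs, so echo shows exactly the list of files cp is about to receive.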

What makes it intimidating is that there are no hints to get you started. The command line is a completely blank slate before you've typed anything into it. If you don't even know to type help or use tab completion, you have no way of getting started and this freaks people out. The strength of graphical interfaces is that when people are lost it's easy for them to poke around. There's a whole menu of options they can spend time reading through until they find one that looks right.

What I hope is that the HUD introduces people to the power of text-based commands. They can use them in the HUD and maybe move on to using them on the command line and learning more about the systems they use every day. I want it to serve a similar purpose to HyperCard back in its day by easing people into unfamiliar new territory. Just as HyperCard was a friendly-looking trojan horse that actually taught people the basics of logic and programming, I hope the HUD is the trojan horse of the command line.

I'm afraid, however, that it will simply scare new converts away.

23. January 2012

Megaupload could spawn case law more destructive than SOPA

It's very early days in the case against Megaupload, and the Americans haven't fully revealed their strategy as yet. There is one part of their rhetoric that really worries me, however.

The full text of the indictment is available here.

The argument is that Megaupload are not eligible for the safe harbour provisions in the DMCA because they knew about infringing content, yet did not remove it. The evidence tendered is as follows:

On or about August 15, 2007, BENCKO sent VAN DER KOLK an e-mail message indicating “the sopranos is in French :((( fuck.. can u pls find me some again ?”

So they knew they were hosting The Sopranos and they did not delete it. What they did not know was that this copy of The Sopranos was not sanctioned by HBO or its French affiliate. It would, in this case, be a fairly reasonable assumption that it wasn't, and that's why this example has been chosen. If this becomes precedent, however, we are in very dangerous territory.

The decision as to what is "obviously infringing" and what is not is incredibly murky. What if, for example, the French dialogue were a fansub? Is that sufficiently transformative that it would be considered fair use?

Say a book is uploaded. Who published it? Is it still within copyright? In what region was it created? It might have different lengths of applicable copyright. It might well have been uploaded by an independent author who wishes to gain exposure.

We recently had a video created for Pinion. When the final render was done, the artist placed it on Megaupload, as it allowed files greater than 300 MB with no hassle. This is the standard way he distributes completed work to clients. We went to download it on the day the raid was completed and got nothing.

This was a video on which we assert copyright. It is, by all definitions, a copyrighted work. And yet, it was placed on Megaupload and we were very annoyed when it was not available via Megaupload.

It is easy to generalise, in the vein of SOPA, that all these smart people working in tech should just make sure that no-one uploads copyrighted material to their services. It's easy, right? If someone uploads a Hollywood movie, just delete it! In reality, though, every time any file was uploaded, an extensive search would need to be conducted to determine whether, where, how, and by whom it was copyrighted. The rightsholder would then need to be contacted to determine whether or not the use was permitted. In the case of transformative or derivative works, the decision would have to be made (and the associated risk assumed!) by the service.

That burden can never be placed on those shoulders. It would be crippling for Google. It would be completely impossible for any startup out there and would stifle a massive amount of innovation.

SOPA merely (merely!) required that every link be checked against a blacklist provided by the US government. If the allegations in this indictment are allowed to stand, industry will not only have to enforce that blacklist, but create and curate it.

The Internet has taken up arms against SOPA/PIPA, but the fight is not just on that front. I am confident the EFF will file an amicus brief in this case. I fervently hope that it is enough.