via www.bajoodoo.com
"MapReduce:
Simplified Data Processing on Large Clusters Jeff Dean, Sanjay Ghemawat
Google, Inc."
"MapReduce ist ein von Google Inc. eingeführtes Framework für nebenläufige Berechnungen über große (mehrere Petabyte[1]) Datenmengen auf Computerclustern. Dieses Framework wurde durch die in der funktionalen Programmierung häufig verwendeten Funktionen map und reduce inspiriert,[2] auch wenn die Semantik des Frameworks von diesen abweicht.[3] MapReduce-Implementierungen wurden in C++, Erlang, Java, Python und vielen anderen Programmiersprachen realisiert."
"The task of the transport factory is to create an object,
which uses the client configuration for talking to the
server. For example, there is a transport factory, which
uses the java.net classes. Another example is a transport
factory based on the Jakarta Commons Http Client. However,
transport factories don’t need to use HTTP: An excellent
example is the local transport factory, which talks to an
embedded server. This last factory is, of course, very
useful for debugging."
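The pattern described in that quote can be sketched roughly as follows. This is not the library's actual API (which is Java); `ClientConfig`, `LocalTransport`, `LocalTransportFactory`, and `call` are hypothetical names, used only to show a factory that builds a transport from the client configuration and talks to an in-process handler instead of going over HTTP.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ClientConfig:
    """Hypothetical client configuration; a real one would carry more settings."""
    server_url: str


class LocalTransport:
    """Transport that hands requests straight to an embedded handler."""

    def __init__(self, config: ClientConfig, handler: Callable[[str], str]):
        self.config = config
        self.handler = handler

    def send(self, request: str) -> str:
        # No network involved: the "server" is just a function in this process.
        return self.handler(request)


class LocalTransportFactory:
    """Factory that creates LocalTransport objects from a client configuration."""

    def __init__(self, handler: Callable[[str], str]):
        self.handler = handler

    def get_transport(self, config: ClientConfig) -> LocalTransport:
        return LocalTransport(config, self.handler)


def call(factory, config: ClientConfig, request: str) -> str:
    """Client code depends only on the factory, not on how requests travel."""
    return factory.get_transport(config).send(request)
```

Because `call` only sees the factory, an HTTP-based factory could be swapped in without touching the client code, which is what makes the local variant handy for debugging.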
"View Snippets This page collects code snippets to be used in your Views. They are mainly meant to help get your head around the map/reduce approach to accessing database content. Keep in mind that the the Futon web client silently adds group=true to your views."
"I’m checking in a patch that should cut down on the number of mailing
list questions asking why a particular reduce function is hella slow.
Essentially the patch throws an error if the reduce function's return
value is more than half the size of the values array that was
passed in. (The check is skipped if the size is below a fixed amount,
200 bytes for now). I expect this heuristic will need fine-tuning. Ideally we’d never
raise the error on “good” reduces, and always raise it on
“ill-conceived” ones. If you hit the error on a reduce that seems like
it should be considered good, please ping the list so we have an
opportunity to fine-tune. There is a config option:
[query_server_config]
reduce_limit = true"
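A rough sketch of that heuristic, reading the rule as "a good reduce shrinks its input to at most half the size". This is not the patch itself: the names `check_reduce_output` and `ReduceOverflowError` are invented here, sizes are measured on the JSON encoding, and applying the 200-byte cutoff to the output size is an assumption.

```python
import json

# Assumed cutoff from the quote: results smaller than this are never checked.
REDUCE_LIMIT_MIN_BYTES = 200


class ReduceOverflowError(Exception):
    """Raised when a reduce result does not shrink its input enough."""


def check_reduce_output(values, result, reduce_limit=True):
    """Raise if the result is over the cutoff and more than half the input size."""
    if not reduce_limit:
        return  # mirrors switching the config option off
    input_size = len(json.dumps(values))
    output_size = len(json.dumps(result))
    if output_size > REDUCE_LIMIT_MIN_BYTES and output_size * 2 > input_size:
        raise ReduceOverflowError(
            f"reduce output is {output_size} bytes for {input_size} bytes of input"
        )
```

A reduce that returns a sum passes easily. One that just returns the values it was handed (a common "ill-conceived" reduce) trips the check once the data grows past the cutoff.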
"Because people keep asking me about a Combiner function (which is written about in the Google Map/Reduce paper and also used in Hadoop), here’s a follow up to yesterdays post. The reason I’m not talking about about a combiner function is because I’m pretty sure the design I propose is isomorphic to a Map/Reduce/Combiner combo design. That is to say it’s the same damn thing, just re-jiggered a little. To illustrate, here is an example that does it’s own combiner operations inside the reduce. Let’s say we have a database full of receipts and we want to calculate the total spent in a month and the average amount of each purchase."
"I’ve heard the arguments for and against database abstraction layers (DALs) time and time again. I must say first, I agree with them all, both sides, equally. Interestingly, I can put the vocal proponents of each side of the argument in one of two boxes: a programmer guy box, or a database guy box. For some unknown reason though, they never seem to see eye to eye."