Category Archives: web development

GWT: 0 to 60mph in no time, Chris Ramsdale

JBossWorld Boston notes, GWT: 0 to 60mph in no time, Chris Ramsdale

From 25000ft
- toolkit, not framework
- code in java, run as JavaScript
- one codebase, any browser
- makes AJAX a piece of cake and faster
- used in many Google projects like Google Wave and AdWords

Family
- SDK, compiler, generator
- eclipse plug-in
- speed tracer

Focus on users
- Our users, Developers
– leverage existing IDEs and tools
– minimize refresh time between code changes
– automate where possible
- Your users, Customers
– minimize startup time
– make it a comfortable experience
– allow them to select the browser

Java to JavaScript compiler, right?
- compiling to JavaScript instead of compiling to Assembler?
- You can pretty print if needed

From code to deployment

Generators
- provide power behind your GWT app

Ajax helper

Creating UIs

Goals
- utilize common development practices
- minimize boilerplate code
- remove a few frustrations along the way

uiBinder XML with

Linker
- different ways of loading the JavaScript

Use “-gen” to display the generated code

Tips and tricks
- Reduce optimizations, reduce compile time
– -draftCompile : skip all optimizations, development only

Optimize for user
- bundle resources
- split code

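import com.google.gwt.resources.client.ClientBundle;
import com.google.gwt.resources.client.ImageResource;

interface Bla extends ClientBundle {
    // ContactCss is assumed to be an interface extending CssResource
    @Source("Contexts.css")
    ContactCss contextCss();

    // image resources in a ClientBundle are exposed as ImageResource
    @Source("Contexts.gif")
    ImageResource contextImage();
}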

insert runAsync for code splitting

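@UiHandler("showImagesButton")
void onOkClicked(ClickEvent event) {
    GWT.runAsync(new RunAsyncCallback() {
        public void onFailure(Throwable reason) {
            // the split code fragment could not be downloaded
        }

        public void onSuccess() {
            // runs once the fragment containing the dialog code is loaded
            showImagesDialog();
        }
    });
}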

“direct” approach
Write a bunch of widgets with self-contained logic
- hard to test – need GWTTestCase
- mocks not encouraged – harder to write smaller tests
- platform specific UI code – limits code reuse
- Too many dependencies, difficult to optimize

MVP approach
Cast
- model – DTOs and business logic
- View – the display
- Presenter – application logic
Goals
- be practical
- avoid rigid patterns
- put complex logic in presenters

MVP vs MVC
- C = view contains event logic
- P = render logic + event logic, separate view

You only have to test the model and presenter, rest is GWT itself and tested by Google :-)
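The testing point can be illustrated with a small sketch in plain Java. This is not code from the talk: the names ContactsView, ContactsModel, ContactsPresenter and RecordingView are hypothetical, and the model call is synchronous to keep the example short (real GWT server calls are asynchronous). The idea is only that the presenter talks to the view through an interface, so a test can substitute a hand-written fake and run on a normal JVM without GWTTestCase.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical view contract: the presenter only ever sees this interface.
interface ContactsView {
    void showContacts(List<String> names);
    void showError(String message);
}

// Hypothetical model contract (DTOs and business logic live behind it).
interface ContactsModel {
    List<String> loadContactNames();
}

// Presenter: application logic, free of any widget or browser dependency.
class ContactsPresenter {
    private final ContactsModel model;
    private final ContactsView view;

    ContactsPresenter(ContactsModel model, ContactsView view) {
        this.model = model;
        this.view = view;
    }

    // Called by the view when the user asks for a refresh.
    void onRefreshRequested() {
        List<String> names = model.loadContactNames();
        if (names.isEmpty()) {
            view.showError("No contacts found");
        } else {
            view.showContacts(names);
        }
    }
}

// Hand-written fake view that records what the presenter asked it to display.
class RecordingView implements ContactsView {
    final List<String> shown = new ArrayList<>();
    String error;

    @Override
    public void showContacts(List<String> names) {
        shown.addAll(names);
    }

    @Override
    public void showError(String message) {
        error = message;
    }
}
```

A test creates a RecordingView, hands it to the presenter together with a stub model, calls onRefreshRequested() and inspects the recorded fields. The real view, built from GWT widgets, would implement the same ContactsView interface.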

Technology interoperability
- Seam
- JSF
- G4JSF

Making the cloud a reality

BeJUG talk, NoSQL with Hadoop and HBase, Steven Noels

Notes are a little bit cryptic, but still…

NoSQL with HBase and Hadoop, Steven Noels, BeJUG 17.06.2010

Intro

“An evolution driven by pain”
Various types of databases, standardized to RDBMS, further simplified to ORM frameworks

We are now living in a world with massive data stores, with caching, denormalization, sharding, replication,… There came a need to rethink the problem, resulting in NoSQL.

Four trends:
- data size, every two years more data is created than existed before
- connectedness, more and more linking between data
- semi-structure,
- architecture, from single client for data, to multiple applications on data (make the db an integration hub), to decoupled services with their own back-end (not mentioned, but the next step will be integration of the back-ends)

Data management was a cost (hardware, DBA, infrastructure people, DB licenses,…)
Moving to considering data as an opportunity to learn about your customers, so you should capture as much as you can.

It is a Cambrian explosion (lots of evolution/new species, but only the tough/best will survive):
HBase, Cassandra, CouchDB, neo4j, riak, Redis, MongoDB,…

Some solutions may no longer exist in a couple of years, and some will become better and popular.

Common themes:
- scale, scale, scale
- new data models
- devops, more interaction between developers, dba, infrastructure
- N-O-SQL, not only SQL
- cloud: technology is of no interest any more

New data:
- Sparse structures
- weak schemas
- graphs
- semi-structures
- document oriented

NoSQL
- not a movement
- not ANSI NoSQL-2010: there is no standard, and none is expected soon
- not one size fits all
- not (necessarily) anti-RDBMS
- not a silver bullet

NoSQL is pro choice

Use NoSQL for…
- horizontal scale (out instead of up)
- unusually common data (free structured)
- speed (especially for writes)
- the bleeding edge

Use SQL/RDBMS for…
- SQL
- ACID
- normalization
- a defined liability

Theory

See also Google Bigtable and Amazon Dynamo papers, Eric Brewer’s CAP theorem
discuss NoSQL papers : nosqlsummer.org

Dynamo: coined the term “eventual consistency”, consistent hashing
Bigtable: multi-dimensional column oriented database, on top of GoogleFileSystem, object versioning
CAP: you can only have two out of three of “strong consistency”, “high availability”, “partition tolerance”
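Consistent hashing, mentioned for Dynamo, can be sketched in a few lines of Java. This is an illustration of the general technique, not Dynamo's implementation: the class name HashRing, the virtualNodes parameter and the use of MD5 to place points on the ring are all assumptions made for the example. Each server is placed at several points on a ring of hash values, and a key belongs to the first server found clockwise from the key's own hash.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Minimal consistent-hash ring; a sketch under the assumptions stated above.
class HashRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int virtualNodes; // assumed tuning knob: ring points per server

    HashRing(int virtualNodes) {
        this.virtualNodes = virtualNodes;
    }

    void addNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.put(hash(node + "#" + i), node);
        }
    }

    void removeNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.remove(hash(node + "#" + i));
        }
    }

    // A key belongs to the first node clockwise from its hash position.
    String nodeFor(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("ring has no nodes");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(key));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    // First 8 bytes of an MD5 digest, used only to spread points over the ring.
    private static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (d[i] & 0xFF);
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 is available on every standard JVM
        }
    }
}
```

The useful property is that removing a server only reassigns the keys that server owned; keys owned by the remaining servers stay where they were, which is what makes adding and removing machines cheap.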

Difference between ACID (RDBMS, pessimistic, strong consistency, less available, complex, analyzable) and BASE (availability and scaling highest priority, weak consistency, optimistic, best effort, simple and fast)

Hadoop: HDFS + MapReduce, single filesystem and single execution space
MapReduce is used for analytical and/or batch processing
Hadoop ecosystem: Chukwa, HBase, HDFS, Hive, Mapreduce, Pig, ZooKeeper,…
Benefit of parallelisation, more ad-hoc processing, compartmentalized approach reduces operational risk
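To make the MapReduce idea concrete, here is an in-memory word count in plain Java that follows the same three phases. It does not use the Hadoop API; the class WordCount and its method names are made up for this sketch, and everything runs in one process, whereas Hadoop would spread the map and reduce calls over many machines.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory illustration of the MapReduce phases; not the Hadoop API.
class WordCount {

    // Map phase: emit a (word, 1) pair for every word in one input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(new SimpleEntry<String, Integer>(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group all emitted values by key.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: collapse the values of one key into a single result.
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int c : counts) {
            sum += c;
        }
        return sum;
    }

    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String line : lines) {
            // each map call only needs its own line, so calls are independent
            emitted.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : shuffle(emitted).entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }
}
```

Because map works on one line at a time and reduce on one key at a time, neither needs shared state, and that independence is what allows the work to be parallelised.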

Technology

Types:

  • key-value stores

    focus on scaling huge amounts of data

    Redis
    - vmware
    - very fast but mostly one server

    Voldemort
    - LinkedIn
    - persistent distributed
    - fault-tolerant
    - java based

  • column stores

    BigTable clones
    sparse tables
    data model: columns->column families->cells

    BigTable

    HBase
    - Stumbleupon, Adobe, Cloudera
    - sorted
    - distributed
    - highly-available
    - high performance
    - multi-dimensional (timestamp)
    - persisted
    - random access layer on HDFS
    - has a central master node
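The “sorted, sparse, multi-dimensional (timestamp)” data model can be pictured as nested sorted maps. The Java sketch below is a toy in-memory model for illustration only: SparseTable and its methods are hypothetical names and have nothing to do with the HBase client API. A row key maps to columns, named family:qualifier, and each column keeps its values per timestamp.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.NavigableSet;
import java.util.TreeMap;

// Toy in-memory table shaped like the Bigtable data model; not the HBase API.
class SparseTable {
    // row key -> "family:qualifier" -> timestamp -> value, all kept sorted
    private final NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> rows = new TreeMap<>();

    void put(String row, String family, String qualifier, long timestamp, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>())
            .computeIfAbsent(family + ":" + qualifier, c -> new TreeMap<>())
            .put(timestamp, value);
    }

    // Latest version of a cell, or null if the cell was never written.
    String get(String row, String family, String qualifier) {
        NavigableMap<Long, String> versions = versions(row, family, qualifier);
        return versions.isEmpty() ? null : versions.lastEntry().getValue();
    }

    // Version visible at the given timestamp (newest write at or before it).
    String getAsOf(String row, String family, String qualifier, long timestamp) {
        Map.Entry<Long, String> e = versions(row, family, qualifier).floorEntry(timestamp);
        return e == null ? null : e.getValue();
    }

    // Rows are sorted by key, so a range scan is just a sub-map view.
    NavigableSet<String> scanRowKeys(String fromInclusive, String toExclusive) {
        return rows.subMap(fromInclusive, true, toExclusive, false).navigableKeySet();
    }

    // Missing rows or columns cost nothing: they are simply absent from the maps.
    private NavigableMap<Long, String> versions(String row, String family, String qualifier) {
        NavigableMap<String, NavigableMap<Long, String>> columns = rows.get(row);
        if (columns == null) {
            return new TreeMap<>();
        }
        NavigableMap<Long, String> v = columns.get(family + ":" + qualifier);
        if (v == null) {
            return new TreeMap<>();
        }
        return v;
    }
}
```

Writing the same cell twice with different timestamps keeps both versions, and a row that never received a given column stores nothing for it, which is what makes the table sparse.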

    Cassandra
    - Rackspace, Facebook
    - key-value with added structure
    - reliability (no master node)
    - eventually consistent
    - distributed
    - tunable partitioning and replication
    - PRO linear scale, write optimized
    - CON 1 row must fit in ram, only pk based querying

  • document databases

    Lotus Notes heritage
    key-value stores but DB knows what the value is
    documents often versioned
    collections of key-value collections

    CouchDB
    - fault tolerant
    - schema-free
    - document oriented
    - RESTful HTTP interface
    - document is a JSON object
    - view system is MapReduce based, Filter, Collate, Aggregate, all javascript
    - out of the box, all data needs to fit on one machine

    MongoDB
    - like CouchDB
    - C++
    - performance focus
    - native drivers
    - auto sharding (alpha)

    Riak

  • graph databases

    data is nodes + relationships + key/value properties

    neo4j
    - mostly RAM centric
    - SPARQL/SAIL implementation
    - scaling to complexity (rather than volume?)
    - “whiteboard” friendly
    - many language bindings
    - little remoting

devoxx 2009 recap

I have spent the entire week at devoxx. This is probably the largest java (or JVM) conference in Europe. As always at conferences, there was a lot to see, and I came back with lots of ideas.

Some ideas and notes from the most interesting sessions I saw :

  • University : “jBPM4 in action” by Tom Baeyens and Joram Barrez This was an interesting look at the progress which is made in jBPM. They did a live demo of test-first development using jBPM. I have had the feeling that I want to use jBPM in my projects for quite a while now, and this feeling was only reinforced. An interesting find was the use of wiser (an in-memory mail server) for testing e-mail integration. For the new features of jBPM itself, the signavio BPM modeller and the new task forms (using freemarker templates) are good to know.
  • NoSQL BOF Wow, an eye-opener BOF which tried to build a round-table discussion about experiences with NoSQL systems. There was mostly one guy (forgot his name) sharing his experiences. The message I took home is that NoSQL can result in a massive though still fast system. Though of course with the limitation that normal SQL queries don’t work. However, scanning the database to retrieve data is much more efficient than you would expect (removing the disk as bottleneck really works).
  • “Examination timetabling with drools solver” BOF by Geoffrey De Smet I really wanted to see this one to figure out how this kind of problem can be solved. Geoffrey showed how it can be done and there are a couple of important tricks. One is basically “learning” how to specify the constraints on the data set. Though the expression language to use is not really difficult, it will probably not be evident on first use. The most important trick however is trying to pre-process your data to have some kind of semi-sensible solution. The drools solver tries to prune the possible solutions by walking through the possibilities in certain ways. While you can make it work with a “bad” start, it will take a lot longer. If you can create an initial situation which already satisfies part of the constraints, the system can do a much better job in a short amount of time. In any case, you can set limits on how long the solver should run.
  • University : “Solr power with lucene” by Erik Hatcher I wanted to see this to know how useful solr is for building a simple document store. I came out convinced that when you want to use full text indexing, you probably want to use the enhancements of solr instead of directly using lucene. You can use the “solrj” library to embed it in your apps. And yes, you can also build the simple document store with it. In fact, thanks to some built in Velocity support, it should be possible to build that with a customized search interface quite quickly. Something I do want to experiment with. An example of the power of solr can be seen at http://search.lucidimagination.com/.
    Some of the added features of solr are tagging, clustering of search results, highlighting and faceting (counting the number of results for a category).
    Erik also gave some interesting tips based on his experience indexing documents. For example, use of the StopAnalyzer (which removes stop words like “the” and “and”) is not really recommended any more. Some interesting references to external projects are PDFBox for working with PDF files and LUKE, a lucene index introspector (a similar thing, also called “luke” as it does the same thing, is included in solr).
  • University : “Hibernate search: full text search for Hibernate” by Emmanuel Bernard I have wanted to integrate Hibernate search in equanda for quite a while but haven’t gotten round to it yet. Following this session was very interesting. Emmanuel gave a lot of details of the features and configuration options. Very interesting are the support for filters (which can, amongst other things, help with security) and the implicit support for joins, where linked fields are also included in the index. For this last option, you can limit the depth to prevent exceptions (which would occur when there are loops).
  • “Clojure” by Howard Lewis Ship I was really impressed by this presentation. The introduction to Clojure was quite interesting, though I think I have more of a tendency to investigate Scala instead. I was impressed with the way the presentation was done. The slides were interesting, he expected us to read them (so didn’t repeat what was on them), and he sprinkled some nice images in there to keep us more interested.
  • “HTML 5 Communications – the new framework for the web” by Frank Greco Though I do like to know what is coming, browser compatibility will hold us back from using HTML5 for another couple of years, so I followed this one mainly because there were no other “more interesting” sessions. I had not expected to get the “working now, even in IE6” pitch. Ok, so you have to use Kaazing. It was not about all HTML5 features, but focused on the new web sockets. Contrary to the current XMLHttpRequest, this removes the http headers and allows bidirectional communication. Real server side push and much more efficient ajax communication become possible in this way.
  • “The lift (scala) web framework” by Timothy Perrett Though I haven’t used Scala yet, I am interested in it as one of the languages which promise to make high performance processing on multi-core cpu systems easier. So why not look at how they do web frameworks in Scala? It was interesting to see this, though I would need to play a little with the framework to get a feel for it before I can judge. It definitely seems to have a vibrant community. http://liftweb.net/
  • “Infinispan and the future of data grids” by Manik Surtani This is the successor to JBoss cache. It is a little simpler (plain map instead of treecache) but has the advantage of being faster. The aim is to function properly on grids of a thousand machines and more. Some interesting sub projects (unfinished) are a lucene directory running on infinispan and a JPA backend for infinispan.
  • “Windmill” by Matthew Eernisse Windmill is an alternative web testing framework which can replace selenium.
  • “Maven reloaded” by Jason Van Zyl A very interesting talk about the upcoming maven 3, which is expected to be released in January 2010. Many interesting things were mentioned. For example, maven 3 will include a maven shell which will help you by building your projects a lot faster than plain maven invocations from the shell. The new maven is going to be faster than maven 2 anyway. For IDEs (and other systems which embed maven) there is the new option to modify the build steps before they are executed. This can be used to prune the steps based on changes etc. and should make building a lot faster. This is already implemented in m2eclipse 0.9.9. Similarly, the reactor plugin (which already exists in maven 2) is included in maven 3, which allows going up or down in a maven project tree, so you can initiate a build from one module without building all other modules.
    Maven 3 is already usable now and is 100% compatible with maven 2. You can find it on the maven download pages.
  • “Pomodoro technique” by Staffan Nöteberg This is a time management technique which helps you to focus on uninterrupted working blocks (of 25 minutes). A nice talk which was attention grabbing with the help of audience involvement, puppets,…
  • “Tapestry 5” by Howard Lewis Ship An introduction to tapestry 5, a framework I have used for a couple of years now and which I really like. Howard had already warned me that I would probably already know everything he was going to say. Still nice to see this introduction to tapestry 5, which I believe is indeed one of the best web frameworks around.