Drupal Smallcore: Where Do We Start?

Posted Nov 4, 2009 // 8 comments
Irakli:

There has been an increasing amount of discussion about the Smallcore initiative in the Drupal community. Smallcore's purpose is to separate Drupal's CMS capabilities from its core APIs and enhance Drupal's characteristics as a universal web-development framework. While most people agree that some level of change is due, the debate around what exactly smallcore should be is still very active.

Before we dive into the details, I would like to make a disclaimer: I whole-heartedly support smallcore in the aspect of making Drupal core more modular, leaner and more streamlined (less one-off spaghetti code thrown together, please). At the same time, when people talk about a substantially higher level of generalization, even a "fully generic system", that scares the heck out of me.

It "scares" me because I've been intimately familiar with one world that has been promising and delivering "fully generic" APIs and specifications for years now: Java/J2EE. It is a world where abstraction is taken to a complete extreme: everything is generalized, fully XML-configured and annotated, but in the end we just get a platform too messy and too slow for Web site development (disclaimer: I think Java is brilliant in many other use cases, just not so much for building web sites). Frameworks like JavaServer Faces are both very cumbersome and SLOW. And that is with bytecode running in a JVM! Imagine interpreted scripting code attempting the same level of abstraction in Drupal.

Balance is Core to Smallcore (no pun intended).

I think we have to be very mindful of the level of generalization we really need. When Ruby on Rails came out of the shadows and took the industry by storm, the biggest thing it had going for it, vs. say Java/J2EE, was less abstraction. David Heinemeier Hansson removed a lot of abstraction, implementing meaningful default behaviors instead. As a result, development became much easier and faster.

A good example from the Drupal world is jQuery. jQuery has served Drupal wonderfully. The level of integration has grown over the years and the benefits from integration have, in my opinion, far exceeded the inconvenience of being tied to one specific framework. On the other hand, Drupal still supports "pluggable" theming engine implementations, but how many modules or themes really use any theming engine other than PHPTemplate? Why do we support the ability to drop in another theming engine? Do we really need the abstraction we've been carrying for many years now?

Another good example is the content-centric approach to the build-out that has been the core philosophy behind Drupal and has served us well. While the actual implementation of the philosophy has certainly evolved, from Nodes with CCK to a full-blown Fields API, the basic principle and approach are still the same and uniform. We just got better at implementing it. This is an important example: Drupal did not offer support for numerous non-content-centric implementations; it just improved the existing one. Would we have benefited if we could implement several approaches instead?

In smallcore debates you often hear the example of the login screen: how it sometimes needs to be "undone" and how the process is hard and painful because the authentication API is not flexible enough. OK, but if the authentication API is broken, is the solution to allow people to build alternative (possibly also broken) authentication APIs or is the solution to fix the Drupal authentication API so it is properly modularized and useful?

The questions stated above, put together, lead to the main question of the smallcore effort: Can we agree that, for now, jQuery, the content-centric approach (Fields API), the security system's basic API, PHPTemplate, the hook system and possibly several other things are indeed smallcore and don't need further abstraction?

One of the strong forces in the Drupal community has always been the tendency to join forces and fix broken stuff, rather than try to create duplicate solutions. Is an overly generic core a threat to that force?

What is Core?

One of the main characteristics of creating a successful microkernel/small core implementation is making a crucial decision about what needs to be abstracted and what is essential to the system. Basically - once you take everything out, whatever stays in the core should not be overridden and does not need to be generalized/abstracted-out.

At Phase2, we have built some very large, complicated, high-traffic websites using Drupal, as well as two popular Drupal-based products, a.k.a. distributions: OpenPublish and Tattler (app). On none of these projects was Drupal's standardization on things like the content-centric approach, using Views or exclusive usage of jQuery a problem. On the contrary, these standard approaches have allowed for more streamlined implementation and much shorter delivery time than would have been possible if we'd used a more abstract framework. Granted, things can always be improved (and that's why we see the new Fields API in Drupal 7 core), but let's also not forget that there's a genuine set of architectural approaches that make Drupal what it is and that are the reason we like it.

So, the million-dollar question is: what are the essential building-blocks that define Drupal as a unique framework, therefore constitute its smallcore (or microkernel or "gecko" or call it whatever you like) and that do not need to be over-ridden or further abstracted, but should be highly optimized for performance and scalability?

About Irakli

Irakli is Director of Product Development at Phase2 Technology. His main responsibility is development of packaged, turn-key solutions using open-source technologies and cutting-edge semantic APIs.


Comments

by Eric A (not verified) on Wed, 11/04/2009 - 16:34

Link to Open Publish

Shouldn't the link to Open Publish be... http://www.opensourceopenminds.com/openpublish ?

Eric

by febbraro on Wed, 11/04/2009 - 17:08

Yep

We fixed it, thanks. Actually the canonical URL is http://openpublishapp.com

by Jeff Eaton (not verified) on Wed, 11/04/2009 - 16:39

Wholehearted agreement

You bring up some excellent points. Before working with Drupal, I spent about four years maintaining a C# based enterprise product that was very heavily metadata driven. Changing database tables meant hand-editing a 2 meg XML file named 'objectschema.xml' and only the fact that we were using a compiled language with a lot of built-in support for XML munching saved us from ourselves.

Trying to make a "Generic System-Building System" in any language is bound to end in tears. A language like PHP is one of the worst to attempt that in, as it lacks a lot of the baked-in introspection and metadata management systems that are part of heavier languages with more presence in the enterprise.

In my mind, Drupal core's tendency towards 'abstract everything, provide a framework for everything, make everything metadata driven, commit to nothing' is actually a symptom of the Framework/Product tethering that we currently suffer from. There's strong, strong pressure to improve the core Drupal *product* but the only way to do that is to tailor it for a given use case. That's unacceptable, though, because it's also a framework that other, contradictory use cases rely on! And thus, the core product has to implement all of its tailored features in a very abstract fashion, with endless hooks and alterable structures in place so that people can strip away the functionality (or tweak it) if needed.

Although I try to fight the idea that 'smallcore' is about stripping stuff out of Drupal, the idea of allowing the core product to include one-off functionality if it needs it, rather than forcing that functionality into the framework, can lead to a smaller, lighter Drupal core. While the default Drupal 'Product' may get heavier with more UX work, more bundled functionality, and more tailoring, it needn't drag the framework along for the ride.

by Jeff Eaton (not verified) on Wed, 11/04/2009 - 16:55

Also, in answer to the closing question...

So, the million-dollar question is: what are the essential building-blocks that define Drupal as a unique framework, therefore constitute its smallcore (or microkernel or "gecko" or call it whatever you like) and that do not need to be over-ridden or further abstracted, but should be highly optimized for performance and scalability?

The "highly optimized for performance and scalability" point is also a good place to point out the need for flexibility. The performance and scaling challenges for individual sites differ wildly -- Drupal core currently does a reasonably good job with things like swappable cache and session handling. There are other places where we can do better, but this model -- of providing key services and making their implementations swappable -- is a great model.
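As a rough illustration of the "key service with a swappable implementation" model described above, here is a minimal sketch in Python. It is not Drupal's cache API: `CacheBackend`, `MemoryCache`, `NullCache` and `render_page` are hypothetical names invented for this example. The point is only that the calling code depends on a small contract, so a site can substitute a backend that fits its own performance needs.

```python
from typing import Dict, Optional, Protocol


class CacheBackend(Protocol):
    """Hypothetical contract that every cache implementation must satisfy."""

    def get(self, key: str) -> Optional[str]: ...

    def set(self, key: str, value: str) -> None: ...


class MemoryCache:
    """Assumed default backend: keeps entries in a per-process dictionary."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        self._data[key] = value


class NullCache:
    """Alternative backend that stores nothing, e.g. while debugging."""

    def get(self, key: str) -> Optional[str]:
        return None

    def set(self, key: str, value: str) -> None:
        pass


def render_page(path: str, cache: CacheBackend) -> str:
    """The caller relies on the contract only, never on a concrete backend."""
    cached = cache.get(path)
    if cached is not None:
        return cached
    html = f"<h1>{path}</h1>"  # stand-in for expensive page building
    cache.set(path, html)
    return html
```

Swapping the backend is then a one-argument change, for example `render_page("node/1", MemoryCache())` versus `render_page("node/1", NullCache())`, and `render_page` itself never changes.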

The underlying architecture of Drupal provides us with things like session management, a granular caching layer, URL routing, translation, file handling, ACL/permissions checking, an event-driven plugin system, a templating layer for HTML output, and so on. There are also things that are very common, but not required: the use of Nodes, and taxonomy, and CCK fields. The use of Views. The use of Actions to wire together events and user-facing responses...
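To make the "event-driven plugin system" item concrete, here is a minimal sketch of the general idea in Python. It is an assumption-laden illustration, not Drupal's hook implementation: Drupal discovers hooks through a function naming convention, whereas this sketch swaps in an explicit registry, and `implements`, `invoke_all`, `blog_menu`, `forum_menu` and `build_routes` are all hypothetical names.

```python
from typing import Any, Callable, Dict, List

# Hypothetical registry: hook name -> callbacks contributed by modules.
_hooks: Dict[str, List[Callable[..., Any]]] = {}


def implements(hook: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Decorator a module uses to declare that a function implements a hook."""

    def register(func: Callable[..., Any]) -> Callable[..., Any]:
        _hooks.setdefault(hook, []).append(func)
        return func

    return register


def invoke_all(hook: str, *args: Any) -> List[Any]:
    """Call every implementation of a hook and collect the non-None results."""
    results = []
    for func in _hooks.get(hook, []):
        result = func(*args)
        if result is not None:
            results.append(result)
    return results


@implements("menu")
def blog_menu() -> Dict[str, str]:
    # A "blog" module contributes one route.
    return {"blog": "Blog listing"}


@implements("menu")
def forum_menu() -> Dict[str, str]:
    # A "forum" module contributes another, unaware of the blog module.
    return {"forum": "Forum index"}


def build_routes() -> Dict[str, str]:
    """The core service merges whatever the installed modules contribute."""
    routes: Dict[str, str] = {}
    for contributed in invoke_all("menu"):
        routes.update(contributed)
    return routes
```

The core code in `build_routes` knows nothing about individual modules, which is what lets solutions be assembled out of modules on top of a small set of services.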

The 'services' provided by core can be built upon by modules, and solutions for end users are built out of modules. The danger, I think, is when we start piling more and more functionality onto the 'core' with required modules, baked-in assumptions, and so on. Views is in use on 60% of Drupal sites, but I don't think that means that it should be part of core. It should be easy to roll a profile that includes it, as well as modules that depend on it, but that doesn't necessarily imply that it is an underlying core 'service.'

Just my $.02 -- not necessarily representative of everyone who is advocating the 'smallcore' approach. I definitely think that it's worth discussing and ironing out our different perspectives as we move forward.

by irakli on Wed, 11/04/2009 - 17:13

Three Bins

Thank you, Eaton, for a very thoughtful comment.

I think we are on the same page that "framework-y" things should eventually be distributed into three major bins:

  1. Kernel - not swappable (maybe, but not necessarily: security API, theming API, jQuery...?).
  2. Kernel - swappable (session handling, cache handling...?).
  3. Framework APIs in Contributions - these are different from modules that provide independent functionality like OpenCalais, in that these are still related to the framework, and usually used for infrastructure, but are not necessary enough to be part of the kernel (in your example Views was such a module).

I think finalizing the agreement on what exactly goes into these three bins is one very important step towards the new core.

Another point I would like to make is that, when putting things together or separating them, aside from architecture, we have to also think about release cycles.

Drupal is not proprietary software, it's community-maintained. Every time we put something closer to core, we inevitably make its release cycles less frequent. For modules like Views, that are definitely framework modules, this consideration may be decisive for whether they should be part of the smaller core or not.

by Larry Garfield (not verified) on Wed, 11/04/2009 - 19:09

Emphasize bin 2

Very valid points! I will, however, call out one comment in the original article:

OK, but if the authentication API is broken, is the solution to allow people to build alternative (possibly also broken) authentication APIs or is the solution to fix the Drupal authentication API so it is properly modularized and useful?

That's not an either/or. A system that is "properly modularized" will, by nature, be more pluggable than one that is not. Even if you don't go as far as making it explicitly pluggable, the more you separate two subsystems and reduce their contact points the easier it is to replace one without breaking the other. It also means fewer bugs (because the interactions are fewer and more localized), easier to follow code (generally), and generally higher quality code overall.
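A small sketch may help show what "reducing contact points" can look like in practice. This is a hypothetical Python illustration, not Drupal's authentication API: `Authenticator`, `PasswordAuthenticator` and `login_page` are invented names, and the plain-text password comparison is there for brevity only. The login code touches the authentication subsystem through a single method, so fixing or replacing that subsystem does not ripple into the caller.

```python
from typing import Dict, Optional, Protocol


class Authenticator(Protocol):
    """The single contact point between login code and authentication."""

    def authenticate(self, username: str, password: str) -> Optional[str]:
        """Return a user id on success, or None on failure."""
        ...


class PasswordAuthenticator:
    """Assumed default implementation backed by a username -> password map.

    Passwords are compared as plain text for brevity only; a real system
    would store and compare salted hashes.
    """

    def __init__(self, users: Dict[str, str]) -> None:
        self._users = users

    def authenticate(self, username: str, password: str) -> Optional[str]:
        if self._users.get(username) == password:
            return username
        return None


def login_page(auth: Authenticator, username: str, password: str) -> str:
    """Login handling knows the contract, not how credentials are checked."""
    user = auth.authenticate(username, password)
    if user is not None:
        return f"Welcome, {user}"
    return "Access denied"
```

Whether or not anyone ever plugs in a second `Authenticator`, keeping the boundary this narrow is what makes the default one easier to fix in place.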

Even if you don't make it truly pluggable, moving in that direction tends to improve the code. (Not always, but usually.) Of course, it's often hard to convince yourself of that when writing something unless you really do make it pluggable. In a sense, making something pluggable is a way to force yourself to make it clean and modular. (You can of course make totally horrific pluggable systems, and Drupal has its share, but that's the idea at least.)

That's one of the reasons we maintain a pluggable theme layer, even if it's not really used. It forces us to think about how the theme system will be used and what parts of it really belong where. That's the example, actually, that Jeff Eaton gave me over dinner one night while I was bemoaning how difficult it was to support PostgreSQL in early versions of DBTNG. :-) Just because no one uses Postgres doesn't mean we shouldn't still support it, because forcing ourselves to support it forces architectural decisions that improve the overall system in the first place.

And you know what? He was absolutely right. Now we have full support for 3 databases, a 4th in contrib, and the overall DBTNG architecture is much cleaner and better than it would have been otherwise.

Making something pluggable can be an excuse to force yourself to make it right. That you also enable other people to swap it out themselves (with a little or a lot of effort) for their own implementations that do things you never dreamed of is a nice side effect. :-)

by Ronald (not verified) on Wed, 11/04/2009 - 19:52

Absolutely agree

In fact, just earlier today I wrote a post where I try to identify what the core aspects of Drupal are and, hence, the things that one should not mess with. In the end I ended up with hooks, the menu system (routing), the PAC pattern for interaction and the non-fixation with OO. This is more restrictive than your non-swappable kernel, but I was looking for pure architecture ideas while you are taking a necessary minimum infrastructure approach.

I think the two put together provide the beginning of an answer.

My post for more details - http://www.istos.it/blog/small-core/drupals-basic-building-blocks-search...

by drifter (not verified) on Thu, 11/05/2009 - 04:34

smallcore is interoperability

I come from a Rails background, and no one would dispute that Ruby on Rails is a pure framework. However, there are very few high-level modules/plug-ins - most are APIs - because basic stuff like authentication and permissions is not built in, and things like localization and routes for plugins (hook_menu) were only very recently added.

Every couple of months there is a new authentication system-du-jour: acts_as_authenticated, restful_authentication, clearance, authlogic, devise. They come with mutually incompatible user tables, and they don't even tackle permissions.

So I'd say the authentication, permissions and localization features should definitely be #smallcore; they allow modules to integrate well.
