Today, I'd like to share some big news. Mike Ryan and I have joined Acquia. Acquia is building a new migration practice within its Professional Services team. Acquia thought we'd be great people to kick-start that effort, and I agree! Look for exciting news from Acquia about data migration in the near future.
I announced Cyrve's birth during Drupalcon Boston, 3.5 years ago. Since then, Cyrve has migrated some of the highest profile web properties to Drupal. We migrated over 3 million user accounts to Economist.com (case study) from Cold Fusion and Oracle into Drupal. We migrated millions of comments and millions of images from a custom MSSQL database into Drupal 7's first big site, Examiner.com (case study). We migrated marthastewart.com from Vignette using Web Services.
Over the years, Cyrve has kept on building and improving the Migrate module. Migrate is Cyrve's methodology and toolkit for pulling data from many different sources and importing quickly and accurately into Drupal. Its highwater and update-in-place features keep content synced between live sources and the not-yet-live Drupal site. Migrate has established itself as the solution for importing into Drupal.
Mike and I have always embraced Drupal's open source ethos. The Migrate module is free and open source, and always will be. Migrate is available for any organization to use, and many of you are doing just that. Thanks for bringing your sites to Drupal! Mike Ryan will continue to actively maintain the Migrate module.
I have joined the Office of the CTO at Acquia as Director, Research and Development. This reads like a dream job to me. I’ll focus less on data migration, and more on researching and prototyping new products and features for Drupal and Acquia. I’ll be writing reports about ‘Drupal and Mobile’, and publishing research about commercial CMS systems. I report to Dries and work alongside Angie Byron (webchick). I'll continue to hack on Drupal core and maintain my contrib projects, drush and devel.
I've thoroughly enjoyed building Cyrve over the past three years. Cyrve's revenue and reputation have steadily risen over the years. Lots of its success is due to the awesome platform that the Drupal community is building. The same holds true for Acquia. Acquia's success depends on the success of Drupal as a whole. This alignment of goals was one more smart idea by Dries and Jay when founding the company.
The Drupal ecosystem is thriving these days. I encourage all the Drupalpreneurs out there to scratch your own niche. It worked for me, and it can work for you too.
The first ever Drush code sprint takes place on Monday and Tuesday of next week at MIT in Cambridge, MA USA. The Drush maintainers are flying in from all over the USA and Canada to focus on Drush core, and readying for a release of Drush 5. See http://piratepad.net/CzHfSVWAzX for a list of projects that we hope to undertake.
We've added some interesting new features and capabilities since Drush 4. Here are some already committed features. Some of these have also been backported to recent releases of Drush 4.
- Unit test suite. Powered by phpunit, our new test suite keeps the bugs away, and encourages sweeping, aggressive refactoring.
- Windows compatibility has been vastly improved. If you use Drush on Windows, we already recommend using a HEAD development snapshot over Drush 4.
- archive-dump command. Creates a tarball from any existing Drupal site. The tarball contains the code, files, and database dump for that site. These contents are organized into a standard called the Site Archive format. This format is already accepted for import into Acquia hosting, Pantheon hosting, and more to come.
- New cache-* commands exercise Drupal's cache API for setting and getting values.
- Commands executed on remote sites now show feedback in real time, instead of waiting until completion.
- Drush shell aliases are analogous to [alias] in a .gitconfig file. That is, one can specify an associative array of "personal" aliases in a drushrc.php file. Those personal (or organizational) aliases are like mini scripts that can call bash commands along with drush commands, including hard coded options and arguments. For example:
$options['shell-aliases'] = array(
  'pull' => '!git pull', // we've all done it.
  'pulldb' => '!git pull && drush updatedb',
  'newdb' => '!drush sql-sync @prod @self',
  'noncore' => 'pm-list --no-core',
  'wipe' => 'cache-clear all',
);
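To make the two kinds of alias value concrete, here is a small illustrative shell model of the rule used in the array above: a value that begins with `!` is run as a shell command, and anything else is treated as a Drush command line. This is only a sketch of the behavior, not Drush's own code, and `expand_alias` is a hypothetical helper name.

```shell
# Illustrative model only; Drush performs this expansion internally.
# expand_alias is a hypothetical helper that prints what an alias value would run.
expand_alias() {
  case "$1" in
    '!'*) printf '%s\n' "${1#?}" ;;    # leading "!": drop it, run as a shell command
    *)    printf 'drush %s\n' "$1" ;;  # otherwise: treated as a Drush command line
  esac
}

expand_alias '!git pull && drush updatedb'   # prints: git pull && drush updatedb
expand_alias 'pm-list --no-core'             # prints: drush pm-list --no-core
```

With the drushrc.php entries above in place, typing `drush wipe` should behave like `drush cache-clear all`.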
I'm sometimes asked about how newcomers should integrate themselves into the Drupal ecosystem. Here's my current answer, published for the world.
Unless you have venture capital sized dreams, your best bet is to focus on a Drupal niche. The good news is that plenty of niches are still wide open for the filling. Contact Moshe if you seriously pursue one of these. I'd like to participate.
Wide open niches
- Drupal load testing. Generic load testing firms have little insight into typical failure modes and weak points for Drupal sites. This domain specific knowledge is huge when developing load test plans. After a while, your reusable test plan library saves a lot of cost. Drupal performance firms like Tag1, 2Bits and Four Kitchens touch on this but I think a focused firm would do well here.
- Drupal User Experience (UX). Drupal UX experts who understand Drupalisms such as local tasks, page regions, and pathauto rules are far more likely to produce a sustainable, cost effective design. It would be so nice to have a One Page Redesign or 37 Signals Express that's focused on Drupal.
- Drupal Customer Relationship Management (CRM). Drupal is really well suited to apps like this that need lots of standard and custom fields and relationships between entities. CiviCRM plays here but there is lots of room still.
- Drupal Quality Assurance (QA). Once you have developed test plans for Drupal Commons, Open Atrium, or Open Publish, you can pretty quickly deliver plans to subsequent clients. Since the plans are the real company treasure, you might consider delegating test running to 3rd parties like Sauce Labs.
- Drupal Analytics. Again, knowledge about Drupal URLs and form submissions and user Fields is critical to mining user behavior on your site. The same knowledge is key when setting up and evaluating advertising campaigns with Adwords and similar ad networks.
- Drupal Workflow. Editorial workflows for media companies can be cumbersome to design and implement. Reusable expertise here is a big win. Implementing workflows in Drupal requires knowledge of some special modules (Views Bulk Operations, Rules, Workflow, ...).
Niche providers - great examples
- Cyrve. Cyrve offers just one service: data migration into Drupal from another system. We don't build web sites, or deliver training. We do perform a dozen large migrations a year which is much more than anyone else in the world. Many Drupal shops partner with Cyrve for the migration part of their engagements.
- Drupal Scout. Security review for your site or security training for your development team. Greg Knaddison and Ben Jeavons have established expertise here, and are now reaping the rewards.
- Emma Jane Hogbin. She's a goddess of Drupal training. Emma entertains as she enlightens. Emma is also a master of the Drupal niche. My jaw fell to the floor when she announced her innovative Site Building Extravaganza course. Emma announced that the course will run if 100 people sign up at a cost of $500/person. They signed up, and Emma Jane is $50,000 richer. That's a lot of Tim Hortons.
- More niche rock stars: Mollom, Top Notch Themes, Commerce Guys, and Volacci (Drupal SEO).
Notice that these are tiny companies (fewer than 3 employees). You can build a company that size.
From Zero to Niche
Your goal for the first year is to build your contribution to the community while growing and showing your expertise. You need to Establish Expertise.
- Start posting. Write on your own blog and/or on groups.drupal.org. Just narrate your journey toward expert. Pretty soon, your posts will generate interest among others in the field. Welcome them. Help them, and ask them to refer their friends.
- Increase readership. Once the blog is humming (5 substantial posts), ask to get your blog added to Drupal Planet. That RSS feed (and Twitter feed) has a large readership.
- Market yourself. Your writings do most of your marketing. But it does help to reach out to media folks like Lullabot Podcast and Drupal Watchdog. They need material, so don't be shy.
- Start speaking. Raise your profile by speaking at your local meetup and/or at a DrupalCamp near you. Eventually try to lead a BoF session or a lecture session at Drupalcon.
- Share code or tools. Share your data migration methodology, or load test plans, or QA plans. Share your templates and style guides. One of the most awesome parts of Drupal is the reputation boost you get by sharing. It can be counter-intuitive to share your treasures. But it's been proven to work in this community. We reward people who get it.
- Be a good partner. Consulting firms like Acquia, Phase2, Lullabot, etc. are constantly negotiating on new engagements. They are also consistently understaffed or lacking expertise in niche areas. Make sure they know about your business. When they ask you to join their deal, provide a quick quote and delight the customer.
This weekend, I started a project to standardize on a LAMP and Drupal stack that migrates data as fast as possible. Cyrve's customers tend to have large data sets (e.g. Examiner.com, The Economist, Martha Stewart, World Economic Forum, ...), so insertion speed is crucial. Early throughput observations ...
|Environment|Throughput|
|My Laptop. Untuned MySQL.|900 nodes/minute|
|My Laptop. MongoDB field storage.|4800 nodes/minute|
|EC2 with RDS hosted MySQL.|1900 nodes/minute|
|EC2 with RDS hosted MySQL and MongoDB field storage.|5300 nodes/minute|
As you can see, adding MongoDB field storage boosted my laptop by more than 5x and boosted EC2+RDS by almost 3x. Wow! This is consistent with improvements we experienced with Examiner.com.
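The multipliers quoted here come straight from the throughput figures in the table. As a quick check, this sketch divides each MongoDB figure by its MySQL baseline; `speedup` is a hypothetical helper name, and the only inputs are the four numbers reported above.

```shell
# speedup BASE NEW: prints NEW / BASE rounded to one decimal place.
speedup() {
  LC_ALL=C awk -v base="$1" -v new="$2" 'BEGIN { printf "%.1f\n", new / base }'
}

speedup 900 4800    # laptop, MySQL vs. MongoDB field storage: prints 5.3
speedup 1900 5300   # EC2 + RDS, MySQL vs. MongoDB field storage: prints 2.8
```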
The dataset I migrated can be found in the migrate_example_baseball module. Here you will find migrate module classes which import a box score from every Major League Baseball game from 2000-2009. All the little bits of the box score (e.g. attendance, batting orders, winning pitcher, etc.) are saved in Fields so this dataset exercises field storage a lot. I think this is typical of larger D7 sites. This dataset should be useful to lots of Drupal benchmarking projects.
All imports were executed using the migrate-import drush command. Like all drush commands, it runs without any involvement from the web server. This is all CLI PHP 5.3, MySQL 5.1, and MongoDB 1.8.
RDS is Amazon's hosted MySQL offering. I chose a Large instance for all tests. I'm not sure how Amazon configures these, but I would hope that they consider the write-heavy usage scenario.
Caveat: These are not formal benchmarks. I did not try to optimize these environments.
We already know that MongoDB field storage massively speeds up web page views (i.e. reads). I'm seeing here that it massively speeds up writes as well. Please consider using MongoDB for your large Drupal projects. MongoDB compatibility on its own is a compelling reason to choose Drupal 7 over Drupal 6. Learn more about MongoDB and Drupal from this Drupalcon session.
The Drush project has been on fire in the past year. In January, we released Drush 4. I realize that we never properly introduced it. So, here are the highlights ...
Much more documentation
Like most unix commands, Drush has always had pretty strong command specific help. Just append --help to any command and you learn much of what it can do. In Drush 4, we added 14 topics and examples which are long-form help that's usually not related to a single command. For example, we discuss site aliases and the Drush bootstrap. The new topic command is your tool for reading these docs. Topics are just text files, and modules/commandfiles can easily ship with them.
Adrian Rollett (acrollet) contributed 9 handy commands for managing user accounts on your Drupal site. Dealing with spammers or setting up new staff has never been quicker:
- user-add-role. Add a role to the specified user accounts.
- user-block. Block the specified user(s).
- user-cancel. Cancel a user account with the specified name.
- user-create. Create a user account with the specified name.
- user-information. Print information about the specified user(s).
- user-login. Display a one time login link for the given user account.
- user-password. Set the password for the user with the specified name.
- user-remove-role. Remove a role from the specified user accounts.
- user-unblock. Unblock the specified user(s).
Moshe Weitzman (that's me) added 5 commands specific to Drupal 7's Field API. In particular, they try to take some drudgery out of populating a content type with a bunch of Fields. These Drush commands get you started quickly. Some GUI work is often needed to configure the instances, formatters and widgets. The commands are:
- field-clone. Clone a field and all its instances.
- field-create. Create fields and instances. Returns urls for field editing.
- field-delete. Delete a field and its instances.
- field-info. View information about fields, field_types, and widgets.
- field-update. Return URL for field editing web page.
Project Manager (pm)
The beloved project manager commands were enriched significantly during this release cycle. I'd like to thank Jonathan Araña Cruz (jonhattan), who did an awesome job as the maintainer of these commands.
pm-download. This command, better known as dl, quickly downloads projects from drupal.org or feature servers. In Drush 4, we added:
- --notes option shows release notes after downloading projects.
- --select option asks you to choose from a few recent releases or the development snapshot release. --select --all lets you choose from all available releases for a given version of Drupal.
pm-updatecode. This command updates some or all of your projects to their latest recommended release. In Drush 4 we added:
- --lock. Pin a project at its current release such that pm-updatecode will skip pending updates.
- --security-only. Only update projects that have security releases pending.
- --notes. Show release notes after updating projects.
- Drush checks for pending releases of itself and asks you to run selfupdate if any are pending.
pm-uninstall. You may now use wildcards when specifying extensions (aka modules/themes). For example, drush en views* uc*.
The above commands are now integrated with git.drupal.org so that you can optionally use git clones instead of tarballs. Let's all work together at http://groups.drupal.org/node/93449 so that Drush can help bring consistent, deep git.drupal.org integration to our sites.
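The pm options described above combine naturally into a couple of small wrappers. This is a minimal sketch that uses only the options named in this post; the function names are hypothetical, and the `DRUSH` variable is an assumed convention so the commands can be previewed with `DRUSH="echo drush"` before running them for real.

```shell
# DRUSH defaults to the real binary; set DRUSH="echo drush" for a dry run.
DRUSH="${DRUSH:-drush}"

# Download a project, picking the release interactively and showing its notes.
download_with_notes() {
  $DRUSH pm-download "$1" --select --notes
}

# Update only those projects that have a security release pending.
security_updates() {
  $DRUSH pm-updatecode --security-only
}
```

For example, `download_with_notes views` would prompt for a release of the views project and then print its release notes.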
We tweaked our SQL commands this release cycle. Perhaps you will enjoy …
sql-dump gets a new --gzip option, which compresses your DB backup file.
sql-sync is a jaw dropping time saver. It dumps, transfers, and loads your database from one server to another. New in Drush 4 is --sanitize, an optional phase for stripping sensitive information such as passwords and emails from the database. Modules and commandfiles can do additional stripping as needed.
sql-drop. A new command to drop all tables in your current database.
Finally, we added support for SQLite and MSSQL, for those awesome Drupal 7 sites that prefer alternative database platforms.
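As a rough illustration of the SQL commands above, the sketch below wraps a compressed backup and a sanitized sync. The function names and the `DRUSH` dry-run variable are assumptions for illustration, and the site aliases passed in (for example @prod and @dev) are placeholders for aliases you have defined yourself. The `--result-file` option of sql-dump names the output file.

```shell
# DRUSH defaults to the real binary; set DRUSH="echo drush" for a dry run.
DRUSH="${DRUSH:-drush}"

# Write a gzip-compressed dump of the current site's database to the given file.
backup_db() {
  $DRUSH sql-dump --gzip --result-file="$1"
}

# Copy a database from one site alias to another, stripping passwords and emails.
sync_sanitized() {
  $DRUSH sql-sync "$1" "$2" --sanitize
}
```

A typical call might be `sync_sanitized @prod @dev`, which keeps real user credentials out of the development copy.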
selfupdate. A new command which updates Drush itself when a new release is pending.
core-cli. A custom shell environment that's optimized for Drush.
--pipe. Show all the nifty .bashrc code that this shell uses so that you can import it into your usual shell. Very useful.
image-flush. Delete all image styles (D7 only). A companion command to pre-generate image styles would be nice for Drush 5.
php-script. Drush scripts are a lightweight alternative to Drush commands. Drush commands are not heavy weight at all but CLI developers can be extraordinarily lazy :) ... Scripts may now be standalone files that can be directly called from the CLI. That works because of their shebang line (first line). See examples/helloworld.script for an example. These Drush scripts have easy access to any CLI arguments that were used.
site-install. Programmatically install a Drupal 6 (new) or Drupal 7 site. Can use any install profile or language.
site-upgrade. Fully automated upgrade script from Drupal 6 to Drupal 7. Re-run this command over and over as you tweak your update functions and available modules. Saves a lot of time.
test-run. Run some or all of the simpletest tests found in your site.
In addition to those people I've mentioned, thanks to Greg Anderson (greg.1.anderson) for many contributions such as standalone Drush scripts, core-cli functionality in your own shell, --sanitize for sql-sync, and boatloads of documentation. Finally, Mark Sonnabaum (msonnabaum) has accepted my request for him to be the Drush 4 branch maintainer. Thanks for your service, Mark.
Cyrve developed the Migrate module to support robust, repeatable data migrations into Drupal. There are other solutions to this problem, but I believe none are as robust as Migrate. This article focuses on Migrate version 2.
Let's fast forward past the development process and focus on performing the import. Migrate provides Drush commands for performing the import. There is currently no user interface for performing the import. Drush is better suited to that task, since it is not subject to PHP timeouts. Even Drupal's batch API is ill suited to import since it has to re-bootstrap Drupal for each batch.
I'm assuming that you are working with a snapshot of your Drupal site and thus it is safe to rollback as needed. On the source side, you can work with your live site or a snapshot. Migrate does not change anything in your source site so feel free to pull from live.
The migrate-status command provides an overview of all your migrations. It tells you what migrations are enabled, and how far along they are in their import.
The migrate-import command is your workhorse. It is responsible for fetching source records and saving them into Drupal. Further, it maps the ID of each source item to a Drupal ID. So, legacy accountId might be mapped to Drupal's userID. Here is migrate-import in action:
migrate-import supports lots of useful options such as idlist, itemlimit, and more. As with all drush commands, you append --help to see its documentation:
You can roll back a given migration or all migrations at any time with migrate-rollback. Migrate uses the aforementioned map table to know exactly which items to delete during rollback.
If you get really flummoxed, you always have the option of discarding your Drupal database and restoring from backup.
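The import and rollback cycle described above can be sketched as two small shell helpers. This is only an illustration: the function names are hypothetical, the migration name you pass in (say, Article) is a placeholder for one of your own, and the `DRUSH` variable is an assumed convention that lets you preview the commands with `DRUSH="echo drush"`.

```shell
# DRUSH defaults to the real binary; set DRUSH="echo drush" for a dry run.
DRUSH="${DRUSH:-drush}"

# Import a small sample of one migration (default 10 items), then show progress.
trial_import() {
  $DRUSH migrate-import "$1" --itemlimit="${2:-10}" &&
  $DRUSH migrate-status
}

# Delete the items that a migration previously imported.
undo_import() {
  $DRUSH migrate-rollback "$1"
}
```

Running `trial_import Article 5`, inspecting the result in Drupal, and then calling `undo_import Article` gives a cheap way to iterate on a migration before committing to a full run.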
Migrate's other commands are more specialized, or useful only to migration developers. For the record, here is the complete list:
Let's talk more about that gigantic global party - #D7RP. On one day, January 7, Drupalers threw 326 parties in 96 countries. These were real world parties - in meatspace. They featured young and old humans, eating and drinking and dancing. In many ways, this was our finest moment as a Drupal community. I just love how we celebrated separately, yet together. That's how we roll.
I'd like to recognize groups.drupal.org as a silent enabler for this wonderful accomplishment. There are 810 regional groups on groups.drupal.org. Have you seen how busy our Event Calendar is? Every weeknight, there are meetups in multiple cities. That's where much of the Drupal teaching and learning happens. Screw Facebook Connect, this is Drupal Connect. I started groups.drupal.org in early 2006. I think it is coming along nicely, no?
groups.drupal.org was not in scope for the recent drupal.org redesign. Its age is showing a bit. I hope someone can give it some focused attention soon. Your ideas are welcome in the Maintenance group.
And while we are reminiscing, let's remember that Drupal 1.0 was released on January 15, 2001. Happy 10th birthday, old friend. Don't fret, we'll celebrate in Chicago.
The Drupal 7 release is imminent. It looks like the final release will happen before this year's winter solstice (Dec 21/22).
cvs tag DRUPAL-7--1-0
Maintainers who didn't pledge are encouraged to release also. We're equal opportunity cheerleaders.
I'd like to thank Dries and the Acquia management team for their generous contribution today. Dries blogged that they have allocated one full time engineer (Katherine Senzee) to work on Drupal 7 critical issues. Phew. The volunteer fire department that has been slaving away on Drupal 7 criticals was getting exhausted. We really need this.
If we think back to the Drupal 6 release, Acquia contributed the same thing by paying Gabor to work on criticals until the release was ready. Apparently, this is what it takes now in order to get core Drupal out the door.
Acquia has shown a real talent over the past three years in aligning their business goals with those of Drupal. Off the top of my head, they also have: