Drupal: Programmatically creating nodes

Note: Work in progress, need to jot down for future reference.
When migrating content from an old system to Drupal, we need some programmatic way of doing it. A Google search for keywords such as "programmatic Drupal" turns up two obvious methods: node_save() and drupal_execute(). Before going further, it is worth clarifying that neither method is a data API that Drupal provides for programmatically importing external data. Far from it. It just happens that we can leverage both functions to put external data in place within the Drupal structure. Because they were not designed for that task, we have to deal with a lot of corner cases.
drupal_execute()
- Flexible; can be used for any kind of content.
- Needs some work to identify the Form API structure being used.
node_save()
- Basic tasks such as populating a plain node are easy. If you don't have any CCK-based content, this is the easiest way to import external content.
- As the name implies, it only works for node content.
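For comparison, a drupal_execute() call for a node form might look like the sketch below. It assumes a content type named 'news' (so the form ID would be 'news_node_form') and an existing 'admin' user; both are placeholders, not something from a real site. The function_exists() guard is only there so the snippet does nothing outside a bootstrapped Drupal 5 site.

```php
<?php
// Values keyed the way the node form's Form API structure expects them.
$values = array(
  'title' => 'external node',
  'body'  => 'some content...',
  'name'  => 'admin', // hypothetical existing username
);
// Minimal node stub so the form knows which content type to build.
$node_stub = array('type' => 'news');

if (function_exists('drupal_execute')) {
  // Form ID is "<type>_node_form"; assumes a 'news' content type exists.
  drupal_execute('news_node_form', $values, $node_stub);
}
?>
```

The "work" mentioned above is mostly finding out which keys the form expects in $values, which varies per content type and per module.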
I've decided to use node_save(). Basically, the following is all we need to create a new node:
<?php
$node = new StdClass;
$node->title = 'external node';
$node->type = 'news';
$node->status = 1; // Published
$node->body = 'some content...';
node_save($node);
?>
[UPDATE] I just noticed that the $node argument to node_save() is passed by reference, so in theory the new nid should be available on $node after the call and the trick below would not be needed. Not tested, though.
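If node_save() does fill in the nid on the object it is given, reading it back would look something like this. It is a sketch under that assumption and, like the update itself, untested; the function_exists() guard just keeps the snippet harmless outside Drupal.

```php
<?php
$node = new stdClass();
$node->title = 'external node';
$node->type = 'news';
$node->status = 1; // Published
$node->body = 'some content...';

$nid = NULL;
if (function_exists('node_save')) {
  node_save($node);
  // Assumption: node_save() assigned the new ID to the object it was given.
  $nid = isset($node->nid) ? $node->nid : NULL;
}
?>
```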
But then one thing becomes apparent: node_save() doesn't return anything, so it seems impossible to manipulate the newly created node further, for example to populate the translation table if you must have multilingual content. Luckily, Drupal 5 still uses a sequence (a native one for PostgreSQL and a custom sequence table for MySQL) to get the node ID instead of relying on an auto_increment value. That's why I prefer the 'next_id' approach over 'last_insert_id', but let's keep that for another topic. Unfortunately, the following code won't work:
<?php
$node = new StdClass;
$node->nid = db_next_id('node_nid');
$node->title = 'external node';
$node->type = 'news';
$node->status = 1; // Published
$node->body = 'some content...';
node_save($node);
?>
That is because when node_save() detects an nid property, it assumes this is an update rather than a new node. So we have to fall back to plain SQL first to insert the base node row:
<?php
$node = new StdClass;
$node->nid = db_next_id('node_nid');
$node->title = 'external node';
$node->type = 'news';
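$node->status = 1; // Published
$node->created = time();
// Use placeholders: PHP does not interpolate variables inside single quotes.
db_query("INSERT INTO {node} (nid, title, type, status, created) VALUES (%d, '%s', '%s', %d, %d)", $node->nid, $node->title, $node->type, $node->status, $node->created);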
$node = node_load($node->nid);
$node->body = 'some content...';
node_save($node);
?>
Database people would say: ugh, that's slow. Why not put the whole thing in one INSERT statement? PostgreSQL ftw... The reason is that Drupal does a lot of things behind the scenes when populating a new node. Unless you want to figure out everything it does, it is better to stay with the existing API as long as you can tolerate the performance penalty. You only need to do this once, after all.
The above code only covers basic node content. If you have a CCK field, this is how you set the node property:
<?php
$node->your_cck_fieldname = array(
array(
'value' => 'some string ...',
'format' => 4
),
);
?>
If the CCK field uses the select widget, the syntax is a bit different:
<?php
$node->your_cck_fieldname['key'] = 'the string value ...';
?>
'key' is a hardcoded value here; enter it as is. It took me some time to figure this out, digging through drupal.org and doing some Googling.
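The two shapes above can be wrapped in small helpers so the import code does not repeat the array structure. This is only a sketch: the function names and the field names field_summary and field_category are made up for illustration, and the default format of 4 is simply the value used in the example above (input format IDs differ per site).

```php
<?php
// Hypothetical helper: value for a plain CCK text field.
function modulename_cck_text_value($value, $format = 4) {
  return array(
    array('value' => $value, 'format' => $format),
  );
}

// Hypothetical helper: value for a CCK field that uses the select widget.
function modulename_cck_select_value($value) {
  return array('key' => $value);
}

$node = new stdClass();
$node->field_summary = modulename_cck_text_value('some string ...');
$node->field_category = modulename_cck_select_value('the string value ...');
?>
```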
If your field is a CCK FileField, where users are allowed to upload files, this is how you can specify it programmatically (provided you have moved the related files to the correct place):
<?php
$node->field_video = array(
array(
'fid' => 'upload',
'title' => $filename,
'filename' => $filename,
'filepath' => $dest,
'filesize' => $filesize
)
);
?>
'upload' is also a hardcoded value here; enter it as is. I created a generic function for this based on this article:
<?php
function modulename_import_filefield($node, $field_name, $file) {
// Get the file size
$details = stat($file);
$filesize = $details['size'];
$name = basename($file);
// Destination path inside your Drupal site's files directory.
// The type/nid directory must already exist; copy() does not create it.
$dest = file_directory_path().'/'.$node->type.'/'.$node->nid.'/'.$name;
// Copy the file to the Drupal files directory
if (!copy($file, $dest)) {
echo "Failed to copy file: $file to $dest\n";
return;
}
$node->$field_name = array(
array(
'fid' => 'upload',
'title' => $name,
'filename' => $name,
'filepath' => $dest,
'filesize' => $filesize,
'list' => 1,
'description' => $name,
'filemime' => mimedetect_mime($name),
)
);
$node = node_submit($node);
node_save($node);
return node_load($node->nid);
}
?>
A note here: I used the plain PHP copy() function instead of the Drupal API's file_copy() function.
Attaching user/authored by field
Even after doing something like this at the beginning of my import code:-
<?php
global $user;
$user = user_load(array('uid' => 1));
$node->uid = 1;
?>
The node would still be attached to the Anonymous user. Looking at the node_submit() function:
<?php
if (user_access('administer nodes')) {
// Populate the "authored by" field.
if ($account = user_load(array('name' => $node->name))) {
$node->uid = $account->uid;
}
else {
$node->uid = 0;
}
}
?>
It becomes obvious that you need to attach a username to the node instead of a uid. So the code should look like this:
<?php
$node->name = 'admin'; // or whatever username that has 'administer nodes' permission.
?>