Thursday, 9 December 2010

Render Farm Design

I'm in the process of re-designing the main render farm infrastructure for the NCCA, so thought I would post the initial design considerations as part of the ongoing post of design examples for the students.

At its simplest level a render farm is a method of distributing rendering tasks amongst a series of processors, where each processor handles a job (usually the rendering of a single frame). There are many commercial solutions to this, but each has different advantages and disadvantages. A discussion of these is really outside the realms of this post, but the decision has been made to write our own flexible solution rather than use an "out of the box" one.

The original farm is described here; the new version will extend this basic idea and add new features, as well as being more extensible to meet the needs of different types of rendering and simulation.

Basic System Outline
The basic system is a homogeneous collection of networked machines, each of which has the relevant software installed to render. A series of transparent network attached storage devices is available to these machines. As this is a Unix based system we don't need to worry about drive letters etc., just that a fully qualified path is visible to the render software and can be accessed.

The basic system looks like this

The basic process involves exporting a single renderable file per frame. This is easy for RenderMan and Houdini as we can generate rib and ifd files for prman and mantra respectively. Maya is more problematic, however, as Maya batch render works on a single Maya file, which causes problems with file access latency. This can be solved by exporting a single mental ray file per frame, but we don't at present have standalone mental ray. So for now the solution will cover prman and mantra only.

Once these files are generated, they may be submitted to the farm for rendering. To allow for multiple machines to render these files we need a centralised repository for the user information as well as the location of the data etc. To do this we use a MySQL database. This is used as it provides a good open source solution for the query and collation of the data and is easy to interface with C++, Python and Qt which are our main development environments.

Submission Process
Renders need to be submitted in batches, where a batch is a collection of frames to be rendered. The user may prioritise these batches and pause them. There are also options to send a mail when the batch has finished and to create a small movie of the frames.

Other options will be added at a later date, for example to examine frames and stop a batch if more than 3 frames are all black or all white (usually because lights are not enabled etc.).
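A minimal sketch of how such a blank frame check might look is given here. It is only an illustration: the function names, the tolerance parameter and the use of a flat list of 0-255 grey values to stand in for a decoded frame are all assumptions, and the real system would first need to read the rendered images.

```python
# Hypothetical sketch: each frame is a flat sequence of 0-255 grey values.


def is_blank(pixels, tolerance=0):
    """Return True if every pixel is (nearly) black or every pixel is (nearly) white."""
    if not pixels:
        return False
    all_black = all(p <= tolerance for p in pixels)
    all_white = all(p >= 255 - tolerance for p in pixels)
    return all_black or all_white


def should_stop_batch(frames, limit=3):
    """Return True once more than `limit` frames in the batch are blank."""
    blank = 0
    for pixels in frames:
        if is_blank(pixels):
            blank += 1
            if blank > limit:
                return True
    return False
```

Counting across the whole batch, rather than looking for consecutive blank frames, is one reading of the rule above; either would be easy to swap in.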

Output from the render and any errors will also be logged so a user may investigate any errors from the render etc. 

There will be a PyQt application to do the submission and management of the user's renders as well as a web front end for the diagnostics.

Each submitted file will be checked to ensure the same frame is not submitted multiple times. This is done by calculating the MD5 sum of the file and using it as a unique key in the database.
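As a rough sketch of the MD5 step, the digest of a frame file can be computed with Python's standard hashlib module. The function name and chunk size here are my own choices for illustration, not part of the farm code.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b""
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The resulting string could then be stored in a column with a unique constraint, so that a second insert of the same file is rejected by the database rather than by application code.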

Standard Unix username and group identifiers are used for user identification, so a user must be logged in to submit and manage frames, and thus can only manage their own jobs. Other Unix tools will also be used to send mail (with the email address extracted from the yp database).

Load Balancing and Scheduling
The system schedules jobs based on a number of criteria. Initially the user with the least render time and least number of jobs will be selected. After this the priority of the batches is considered, with the highest priority batch being selected first (with 0 being highest and 99 lowest). Within each batch, jobs are also ordered based on the output frame number (Frame.0001.rib, Frame.0002.rib etc.).
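The ordering described above can be sketched with plain Python sort keys. The data layout and function names are hypothetical, and since the post does not say how render time and job count are weighed against each other, this sketch assumes render time is compared first and job count breaks ties.

```python
import re


def pick_user(users):
    """Pick the user with the least render time, then the fewest jobs.

    `users` maps login name -> (render_seconds, num_jobs).
    """
    return min(users, key=lambda name: users[name])


def frame_number(filename):
    """Extract the frame number from names such as Frame.0001.rib."""
    match = re.search(r"\.(\d+)\.[^.]+$", filename)
    return int(match.group(1)) if match else 0


def order_jobs(batches):
    """Order one user's jobs by batch priority (0 highest), then frame number.

    `batches` is a list of (priority, [filenames]) tuples.
    """
    ordered = []
    for _priority, files in sorted(batches, key=lambda batch: batch[0]):
        ordered.extend(sorted(files, key=frame_number))
    return ordered
```

In the real system this selection would more likely be an ORDER BY clause in the database query; the sketch only shows the intended order.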

Further refinement of the selection can be based on groups such that year groups and individual course work deadlines may be prioritised.

The main aim of this process is the removal of an overall render wrangler role, and jobs will be selected in a fair manner, with the overall load averaging out. These values will be re-set at regular intervals to not penalise early use of the farm for test renders etc. 

Render Client
The render client on each of the worker machines will have a selection of roles, determined via a table in the database. For example, the old render farm blades are only 32 bit but can still be used for compositing, so only compositing jobs will be passed to these machines.

Each desktop machine will monitor its load and whether a user is logged in, and will start rendering if the load is below certain criteria. If a user logs into a machine whilst it is rendering, the job will be lowered in priority once the user's tasks reach a certain CPU / memory load.
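One polling step of this behaviour might be sketched as below. The thresholds are placeholder values rather than figures from the design, the function names are hypothetical, and how the user's own load would be measured is left open. os.getloadavg is only available on Unix, which matches the farm.

```python
import os


def normalised_load():
    """1-minute load average divided by the CPU count (Unix only)."""
    return os.getloadavg()[0] / (os.cpu_count() or 1)


def render_action(machine_load, user_load, user_logged_in, rendering,
                  start_below=0.25, renice_above=0.5):
    """Return 'start', 'renice', 'continue' or 'idle' for one polling step."""
    if rendering:
        # A logged-in user whose tasks are busy gets priority over the render.
        if user_logged_in and user_load >= renice_above:
            return "renice"
        return "continue"
    # Not rendering: only pick up a job when the machine is quiet enough.
    if machine_load < start_below:
        return "start"
    return "idle"
```

Keeping the decision in a pure function like this makes the policy easy to test without a real machine under load.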

It will also be possible to remove batches of machines from the farm by disabling the nodes in the client list.

At present, for most software we have enough command line render licenses to cope with all the machines in the groups, so license allocation will not be an issue for now, but it needs to be considered in the larger picture of the design at some later stage.

Initial Table Designs
The following scans show my initial design sketches
The scans show the main outline of each of the tables and some of the data types. More importantly, we can see the relationships between the tables as well as some areas which I have already normalised. Whilst not fully normalised, this is enough for the speed and access to data we need. This will be investigated further once the initial system is developed.

Database Development
To develop the database the excellent MySQL Workbench has been used. This allows the visual development of the database tables and the forward / reverse engineering of databases. The initial tables from the design above are shown in the following diagram

The Workbench tool will generate SQL scripts for the creation of the tables. For example, the following script produces the userInfo table:
CREATE  TABLE IF NOT EXISTS `render`.`userInfo` (
  `uid` INT NOT NULL ,
  `numRenders` INT NULL ,
  `renderTimes` TIME NULL ,
  `lastRender` TIMESTAMP NULL ,
  `userName` VARCHAR(45) NULL ,
  `loginName` VARCHAR(45) NULL ,
  `course` VARCHAR(45) NULL ,
  `gid` INT NULL ,
  PRIMARY KEY (`uid`) )
ENGINE = InnoDB;

It is important to note in the above SQL that we are using the InnoDB engine as this is the only one that supports foreign keys in MySQL.

The data for this table is generated from the Unix yp system using a simple Python script. First we use ypcat passwd to grab the file, which is in the following format:
jmacey:x:12307:600:Jonathan Macey:/home/jmacey:/bin/bash
The first entry is the login user name, the 3rd the user id (UID), the 4th the group id (GID) and the 5th the long username.

The group values can be extracted from the group file which has the following format

From this we use the 3rd entry, the numeric GID, as the key to look up the text group name for each user. This is shown in the following Python script.

import getopt
import os
import sys

import MySQLdb

# Assumption: the database host comes from the RenderDB environment
# variable, as the error message in addUsers suggests.
DBADDR = os.environ.get("RenderDB", "")


class Usage(Exception):
    def __init__(self, msg):
        self.msg = msg


def usage():
    print("AddUsers : add users to the renderfarm db")
    print("(C) Jon Macey")
    print("_______________________________________")
    print("-h --help display this message")
    print("Usage AddUsers [group file] [userfile]")
    print("Where userfile is the output of ypcat passwd")
    print("searches for username and UID from  this file and adds it to the db")


def readGroups(_filename):
    print("reading Group file creating dictionary")
    groups = {}
    # read in all the data
    with open(_filename) as groupFile:
        data = groupFile.readlines()
    # now read through the file and map the numeric GID to the group name
    # (assumes the standard name:passwd:gid:members layout)
    for line in data:
        line = line.strip().split(":")
        if len(line) < 3:
            continue
        groups[line[2]] = line[0]
    return groups


def addUsers(filename, groups):
    # here we create a connection to the DB
    if DBADDR == "":
        print("RenderDB is not set please set to master server")
        return
    print(DBADDR)
    DBConnection = MySQLdb.connect(host=DBADDR, user="RenderAdmin", passwd="xxxxxxx", db="Render")
    # now we create a cursor to the table so we can insert an entry
    cursor = DBConnection.cursor()

    # so we open the file for reading and read in all the data
    with open(filename) as userFile:
        data = userFile.readlines()
    # parameterised query, so the driver does the quoting of the values
    query = ("insert into userinfo "
             "(uid,numRenders,renderTimes,lastRender,userName,loginName,course,gid) "
             "values (%s,0,'00:00:00','00:00:00',%s,%s,%s,%s)")
    # now read through the file and try and find the UID and username
    for line in data:
        #                      0  1  2     3   4
        # data in the form jmacey:x:12307:600:Jonathan Macey:/home/jmacey:/bin/bash
        line = line.strip().split(":")
        if len(line) < 5:
            continue
        loginName = line[0]
        uid = int(line[2])
        gid = int(line[3])
        userName = line[4]
        cursor.execute(query, (uid, userName, loginName, groups.get(line[3]), gid))
    # InnoDB is transactional, so commit before closing
    DBConnection.commit()
    # close the DB connection
    cursor.close()
    DBConnection.close()
# end of addUsers function


def main(argv=None):
    if argv is None:
        argv = sys.argv
    try:
        try:
            opts, args = getopt.getopt(argv[1:], "h", ["help"])
        except getopt.error as msg:
            raise Usage(msg)
    except Usage as err:
        sys.stderr.write("%s\n" % err.msg)
        sys.stderr.write("for help use --help\n")
        return 2
    for opt, arg in opts:
        if opt in ("-h", "--help"):
            usage()
            return 0

    if len(argv) != 3:
        print("no group and password file passed")
        usage()
        return 2
    groups = readGroups(argv[1])
    print(groups)
    addUsers(argv[2], groups)
    return 0


if __name__ == "__main__":
    sys.exit(main())

The first pass loads the group file and generates a Python dictionary of values, the first being the numeric key and the second the text group name. The next pass reads the users file and inserts the data into the userInfo table ready for use.
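To check the parsing without touching the database, the two passes can be sketched as pure functions. The function names are mine, the group line layout (name:passwd:gid:members) is the standard Unix one and is assumed here, and the "staff" group line in the usage example is made up for illustration.

```python
def parse_groups(lines):
    """Map numeric GID (as text) -> group name from name:passwd:gid:members lines."""
    groups = {}
    for line in lines:
        fields = line.strip().split(":")
        if len(fields) >= 3:
            groups[fields[2]] = fields[0]
    return groups


def parse_users(lines, groups):
    """Yield (uid, userName, loginName, course, gid) tuples from ypcat passwd output."""
    for line in lines:
        fields = line.strip().split(":")
        if len(fields) < 5:
            continue  # skip malformed or blank lines
        yield (int(fields[2]), fields[4], fields[0],
               groups.get(fields[3]), int(fields[3]))


# Usage with one made-up group line and the passwd line shown earlier
groups = parse_groups(["staff:x:600:jmacey"])
rows = list(parse_users(
    ["jmacey:x:12307:600:Jonathan Macey:/home/jmacey:/bin/bash"], groups))
```

Each tuple is in the same order as the values of the insert statement, so it could be handed to the database cursor as the query parameters.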

The next stage of the process is to develop a submission script. This will be the subject of the next post in this area, where I will also go into more detail on the Python SQL interface and PyQt database interaction.
