[Radiance-general] ranimate, recovering from broken rpicts?

Jack de Valpine jedev at visarc.com
Sun Feb 19 18:59:32 CET 2006


Hi Lars,

In brief again. I did some preliminary experimentation with OpenSSI a 
couple of years ago. I think that it shows a lot of promise and has a very 
supportive user group. I believe a couple of the developers are from HP 
and thus really interested in developing a robust and stable platform. 
They have integrated some of HP's clustering technology as well as tools 
from other systems. I believe that they have utilized some elements of 
the OpenMosix load balancing algorithm.

As you know, I did experiment with OpenMosix as well prior to learning 
about OpenSSI. At the time my experiments led me to recognize a problem 
with OpenMosix when jobs requiring large allocations of memory start up 
(e.g. multiple jobs get started on a given node, the jobs need to load into 
memory prior to getting migrated, but the memory requirement exceeds 
that of the local node, so swapping occurs). Thus my instinct, if I were 
to use OpenMosix, would be to forgo the automated migration mechanism 
and develop a simple scheduler that would send jobs out to specified nodes.
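
To make the idea concrete, here is a minimal sketch of such a scheduler, 
not something I have run in production. It assigns frames to nodes 
round-robin and only prints the ssh commands it would run. The node names, 
the frame list and WORKDIR are hypothetical placeholders, the rpict 
arguments are copied from the command in the quoted message below, and it 
assumes a filesystem shared between nodes so the relative paths resolve 
everywhere.

```shell
#!/bin/sh
# Hypothetical settings: replace with the real node names, frame numbers
# and the project directory visible on every node (shared filesystem assumed).
NODES="node01 node02 node03"
FRAMES="1 2 3 4 5"
WORKDIR="/shared/project"

# Print one ssh command per frame, cycling through the node list.
# Nothing is executed here; this is a dry run that only writes the plan.
plan_jobs() {
    nodes=$1
    frames=$2
    set -- $nodes                  # positional parameters now hold the nodes
    for f in $frames; do
        node=$1
        shift
        set -- "$@" "$node"        # rotate: used node goes to the back
        base=$(printf 'stills/frame%03d' "$f")
        # rpict options taken from the command quoted later in this thread
        printf "ssh %s 'cd %s && nohup rpict @stills/render.opt -w0 -ro %s.unf scene_illum.oct >%s.log 2>&1 &'\n" \
            "$node" "$WORKDIR" "$base" "$base"
    done
}

plan_jobs "$NODES" "$FRAMES"
```

Piping the output to sh would actually launch the jobs. With more frames 
than nodes, a node receives several jobs at once, which is exactly the 
memory situation described above, so in practice one would submit no more 
frames per batch than there are nodes, or wait for a node to finish first.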

If I decide to move forward with a clustering solution, my instinct 
would be to go with OpenSSI at this time. This would be from the standpoint 
of stability, robustness, shared filesystems and development team. But 
this is just my bias, without anything well documented at this time to 
support it. There was an excellent paper written on Single System Images 
comparing OpenMosix, OpenSSI and Kerrighed:

    http://www.irisa.fr/paris/Biblio/Papers/Lottiaux/LotBoiGalValMor05CCGrid.pdf


The latter is a clustering solution out of a French research group, 
though I am not sure if it is available in any kind of stable release (I 
have not checked).

I am happy to see that you and Francesco at least have made some real 
efforts to use OpenMosix in a production setting. It would be great to 
hear about some of your experiences thus far. I think that the real 
opportunity of these clustering systems is as follows:

   1. single process space across nodes
   2. shared filesystems that do locking/caching correctly (i.e. more
      stable than NFS)

There are many others, but these are the main features that come to mind.

Best,

-Jack de Valpine

Lars O. Grobe wrote:
> Hi!
>
>> I agree with Greg, and I think you can launch rpict -ro directly
>> on your "master" node and wait for automatic migration,
>> or use the "runon" or "migrate" scripts to move the jobs to your
>> preferred nodes, if necessary.
>
> I usually start all jobs on my local node and let them get migrated. 
> This will usually take some minutes, but as the rendering time for a 
> picture is better described in weeks than days at the moment, the 
> startup time is not important as long as the nodes do not run out of 
> memory. Also I think I should write a small how-to, as this way of 
> distributing renderings works really nicely (as long as the network is 
> stable, else the ssh or former rsh way is more fault tolerant).
>
> I started the rpict processes just the same way ranimate would do. As 
> far as I know, rpict -ro will find out which view to use from the 
> viewfile, as this should be contained in the image header, right? So 
> the command is
>
> nohup /opt/openmosix/bin/mosrun -c /opt/radiance/bin/rpict 
> @stills/render.opt -w0 -ro stills/frame003.unf scene_illum.oct &
>
> for the third frame. In fact, it is amazing again and again how 
> powerful these small little radiance tools are, e.g. that I can 
> recover that easily...
>
> One other question, did anyone use OpenSSI with Radiance? It should 
> even allow using rpict with shared memory (-PP), but I could not 
> install it so far because it likes to live in its own network, and my 
> machines have to integrate into an existing network.
>
> CU Lars.

-- 
# Jack de Valpine
# president
#
# visarc incorporated
# http://www.visarc.com
#
# channeling technology for superior design and construction
