[Radiance-general] ranimate, recovering from broken rpicts?
Jack de Valpine
jedev at visarc.com
Sun Feb 19 18:59:32 CET 2006
Hi Lars,
In brief again. I did some preliminary experimentation with OpenSSI a
couple of years ago. I think that it shows a lot of promise and has a
very supportive user group. I believe a couple of the developers are
from HP and thus really interested in developing a robust and stable
platform. They have integrated some of HP's clustering technology as
well as tools from other systems. I believe that they have utilized some
elements of the OpenMosix load balancing algorithm.
As you know, I also experimented with OpenMosix before learning about
OpenSSI. At the time, my experiments led me to recognize a problem with
OpenMosix when jobs requiring large allocations of memory start up: when
multiple jobs are started on a given node, they need to load into memory
before they can be migrated, and if their combined memory requirement
exceeds that of the local node, swapping occurs. Thus my instinct, if I
were to use OpenMosix, would be to forgo the automated migration
mechanism and develop a simple scheduler that would send jobs out to
specified nodes.
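To make the scheduler idea a little more concrete, here is a rough
sketch of the placement step only. It is not something I have run in
production. The node names, memory figures and the assign_jobs helper
are all hypothetical, and actually launching each job on its chosen node
(for instance over ssh) is left out. The point is just that a job is
only handed to a node that still has enough free memory for it, so
nothing is pushed into swap at startup.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str      # hypothetical host name
    free_mb: int   # memory not yet claimed by an assigned job
    jobs: list = field(default_factory=list)


def assign_jobs(jobs, nodes):
    """Assign (job_name, mem_mb) pairs to the node with the most free memory.

    Jobs that fit on no node are returned as a waiting list instead of
    being started, so no node is overcommitted.
    """
    waiting = []
    # Place the largest jobs first, since they are the hardest to fit.
    for job_name, mem_mb in sorted(jobs, key=lambda j: j[1], reverse=True):
        best = max(nodes, key=lambda n: n.free_mb)
        if best.free_mb >= mem_mb:
            best.jobs.append(job_name)
            best.free_mb -= mem_mb
        else:
            waiting.append(job_name)
    return waiting


if __name__ == "__main__":
    # Assumed figures for illustration only.
    nodes = [Node("node1", 2048), Node("node2", 1024)]
    jobs = [("frame001", 900), ("frame002", 900),
            ("frame003", 900), ("frame004", 900)]
    waiting = assign_jobs(jobs, nodes)
    for node in nodes:
        print(node.name, node.jobs, node.free_mb)
    print("waiting:", waiting)
```

Jobs left on the waiting list would simply be retried once a running job
finishes and its memory is returned to the node.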
If I decide to move forward with a clustering solution, my instinct
would be to go with OpenSSI at this time. This would be from the
standpoint of stability, robustness, shared filesystems and development
team. But this is just my bias, without anything well documented at this
time to support it. There was an excellent paper written on Single
System Images comparing OpenMosix, OpenSSI and Kerrighed:
http://www.irisa.fr/paris/Biblio/Papers/Lottiaux/LotBoiGalValMor05CCGrid.pdf
Kerrighed is a clustering solution out of a French research group,
though I am not sure if it is available in any kind of stable release (I
have not checked).
I am happy to see that you and Francesco at least have made some real
efforts to use OpenMosix in a production setting. It would be great to
hear about some of your experiences thus far. I think that the real
opportunities of these clustering systems are as follows:
1. single process space across nodes
2. shared filesystems that do locking/caching correctly (i.e. more
stable than NFS)
There are many others, but these are the main features that come to mind.
Best,
-Jack de Valpine
Lars O. Grobe wrote:
> Hi!
>
>> I agree with Greg, and I think you can launch rpict -ro directly
>> on your "master" node and wait for automatic migration,
>> or use the "runon" or "migrate" scripts to move the jobs to your
>> preferred nodes, if necessary.
>
> I usually start all jobs on my local node and let them get migrated.
> This will usually take some minutes, but as the rendering time for a
> picture is better described in weeks than days at the moment, the
> startup time is not important as long as the nodes do not run out of
> memory. Also I think I should write a small how-to, as this way of
> distributing renderings works really nicely (as long as the network is
> stable, else the ssh or former rsh way is more fault tolerant).
>
> I started the rpict processes just the same way ranimate would do. As
> far as I know, rpict -ro will find out which view to use from the
> viewfile, as this should be contained in the image header, right? So
> the command is
>
> nohup /opt/openmosix/bin/mosrun -c /opt/radiance/bin/rpict
> @stills/render.opt -w0 -ro stills/frame003.unf scene_illum.oct &
>
> for the third frame. In fact, it is amazing again and again how
> powerful these small little radiance tools are, e.g. that I can
> recover that easily...
>
> One other question: has anyone used OpenSSI with Radiance? It should
> even allow rpict to be used with shared memory (-PP), but I could not
> install it so far because it likes to live in its own network, and my
> machines have to integrate into an existing network.
>
> CU Lars.
--
# Jack de Valpine
# president
#
# visarc incorporated
# http://www.visarc.com
#
# channeling technology for superior design and construction