[Radiance-general] "Broken pipe" message from rpiece on multi-core Linux system

Randolph M. Fritz RFritz at lbl.gov
Tue Apr 10 22:27:58 PDT 2012


Thanks Jack, Greg.

Jack, what kernel were you using?  Was it also Linux?

Greg, I was using rad, so those delays are already in there, alas.  I 
wonder if there is some subtle difference between the Mac OS Mach 
kernel and the Linux kernel that's causing the problem, or if it occurs 
on all platforms, just more frequently on the very fast cluster nodes.

Or, it could be an NFS locking problem, bah.
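
A quick way to probe the NFS theory is to check whether the filesystem that holds the sync file grants advisory locks at all. The sketch below is only illustrative: it assumes rpiece relies on POSIX (fcntl-style) locks on the sync file, which this thread does not confirm, and `probe_lock` and the probe file name are hypothetical. It tests a single process on a single node, so a passing result does not rule out contention between nodes.

```python
import fcntl
import os
import sys
import tempfile


def probe_lock(path):
    """Try to take and release an exclusive advisory lock on `path`.

    Returns True if the lock was granted, False if it was refused
    (for example, held by another process or unsupported by the mount).
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        try:
            # Non-blocking, so a broken lock manager fails fast instead of hanging.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            return False
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return True
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Pass the directory that holds the sync file; defaults to a local temp dir.
    directory = sys.argv[1] if len(sys.argv) > 1 else tempfile.mkdtemp()
    probe_path = os.path.join(directory, "lock_probe.tmp")  # hypothetical name
    print("lock granted" if probe_lock(probe_path) else "lock refused")
```

Running it once against a local directory and once against the NFS-mounted working directory gives a first comparison between the two.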

If I find time, maybe I can dig into it some more.  Right now, I may 
just finesse it by running multiple *different* simulations on the same 
cluster node.

Randolph

On 2012-04-09 21:52:47 +0000, Greg Ward said:

> If it is a startup issue as Jack suggests, you might try inserting a 
> few seconds of delay between the spawning of each new rpiece process 
> using "sleep 5" or similar.  This allows time for the sync file to be 
> updated without contention between processes.  This is what I do in rad 
> with the -N option.  I actually wait 10 seconds between each new rpiece 
> process.
> 
> This isn't to say that I understand the source of your error, which 
> still puzzles me.
> 
> -Greg
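
For anyone driving rpiece by hand rather than through rad, the staggered start described above might look roughly like the following. This is a minimal sketch, not what rad does internally: `launch_staggered` is a hypothetical helper, and the commands shown are placeholders standing in for full rpiece command lines.

```python
import subprocess
import sys
import time


def launch_staggered(commands, delay_seconds):
    """Start each command in turn, pausing between launches.

    `commands` is a list of argument lists, one per process.
    Returns the exit codes in launch order once all have finished.
    """
    procs = []
    for i, cmd in enumerate(commands):
        if i > 0:
            # Give the previous process time to update the sync file.
            time.sleep(delay_seconds)
        procs.append(subprocess.Popen(cmd))
    return [p.wait() for p in procs]


if __name__ == "__main__":
    # Placeholder jobs; in practice each entry would be one rpiece command line.
    jobs = [[sys.executable, "-c", "pass"] for _ in range(4)]
    print(launch_staggered(jobs, delay_seconds=1))
```

A non-zero entry in the returned list identifies which process failed, which may help narrow down whether the failures cluster at startup.
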
> From: Jack de Valpine <jedev at visarc.com>
> Date: April 9, 2012 1:46:03 PM PDT
> 
> Hey Randolph,
> 
> I have run into this before. Unfortunately I have had limited success 
> in tracking down the issue and also have not really looked at it for 
> some time. If I recall correctly, a couple of things that I have 
> noticed:
>   • Possible problem if a piece finishes before the first set of pieces 
>     has been parcelled out by rpiece. For example, if 8 pieces are being 
>     distributed at startup and piece 2 finishes before one of pieces 
>     1, 3, 4, 5, 6, 7, 8 has even been processed by rpiece, or while 
>     rpiece is still forking off the initial jobs.
> 
> Sorry I cannot offer more. I have spent some time in the code on this 
> one, and it is not for the faint of heart, to say the least. 
> -Jack
> -- 
> # Jack de Valpine
> # president
> #
> # visarc incorporated
> # http://www.visarc.com
> #
> # channeling technology for superior design and construction
> 
> On 4/9/2012 3:29 PM, Randolph M. Fritz wrote:
> This problem is back for a sequel, and it would really help my work if 
> I could get it going. 
> 
> It's been a few months since I last asked about this.  Has anyone else 
> experienced this in a Linux environment?  Anyone have any ideas what to 
> do about it or how to debug it? 
> 
> /proc/version reports: 
>  Linux version 2.6.18-274.18.1.el5 
> (mockbuild at builder10.centos.org) (gcc 
> version 4.1.2 20080704 (Red Hat 4.1.2-51)) #1 SMP Thu Feb 9 12:45:44 
> EST 2012 
> 
> Randolph 
> 
> On 2011-07-08 01:13:01 +0000, Randolph M. Fritz said: 
> 
> On 2011-07-07 16:54:06 -0700, Greg Ward said: 
> 
> Hi Randolph, 
> 
> This shouldn't happen, unless one of the rpict processes died 
> unexpectedly.  Even then, I would expect some other kind of error to be 
> reported as well. 
> 
> -Greg 
> 
> Thanks, Greg.  I think that's what happened; in fact seven of the 
> eight died in two cases.  Weirdly, the third succeeded.  If I run it as 
> a single-processor job, it works.  Here's a piece of the log: 
> 
> rpiece -F bl_blinds_rpsync.txt -PP pfLF5M90 -vtv -vp 60.0 -2.0 66.0 -vd 
> 12.0 0.0 0.0 -vu 0 0 1 -vh 60 -x 1024 -y 1024 -dp 512 -ar 42 -ms 3.6 
> -ds .3 -dt .1 -dc .5 -dr 1 -ss 1 -st .1 -af bl.amb -aa .1 -ad 1536 -as 
> 392 -av 10 10 10 -lr 8 -lw 1e-4 -ps 6 -pt .08 -o bl_blinds.unf bl.oct 
> 
> rpict: warning - no output produced 
> 
> rpict: system - write error in io_process: Broken pipe 
> rpict: 0 rays, 0.00% after 0.000u 0.000s 0.001r hours on n0065.lr1 
> rad: error rendering view blinds 
> 
> 
> _______________________________________________
> Radiance-general mailing list
> Radiance-general at radiance-online.org
> http://www.radiance-online.org/mailman/listinfo/radiance-general


-- 
Randolph M. Fritz