From decot@hpisod2.HP.COM Sun May 6 02:15:28 1990
Newsgroups: comp.unix.wizards
Organization: Hewlett Packard, Cupertino
From: decot@hpisod2.HP.COM (Dave Decot)
Subject: Re: Novice question concerning setpgrp()/tcsetpgrp()
Date: 4 May 90 01:17:45 GMT

The functions setpgrp() and tcsetpgrp() should not be used in the same
program: the meaning of setpgrp() varies from system to system, and thus
that routine is not defined in POSIX.1. However, tcsetpgrp() was
invented by POSIX.1, so has a portable meaning (when it is available).

I assume that, since tcsetpgrp() is available on your system, it is
close to a POSIX.1 conforming system. If not, the following do not
apply since the functionality was invented by POSIX.1:

If you are trying to start a new session (or System V.3-style process
group), use setsid() (on a System V.3 or earlier system, this is what
setpgrp() does, sort of). It is very unlikely that you want to do this
at the point you discussed in your shell.

If you are trying to start a new BSD-style process group (aka, a
separate "job"), use setpgid(). It is likely that this is what you are
trying to do.

If you are trying to change the process group of the terminal, use
tcsetpgrp(). This could also be useful in the shell fragment you are
converting.

Dave Decot

From dkeisen@Gang-of-Four.Stanford.EDU Wed Apr 3 13:27:45 1991
From: dkeisen@Gang-of-Four.Stanford.EDU (Dave Eisen)
Newsgroups: comp.unix.internals
Subject: Re: TIOCNOTTY ioctl under Sys5
Date: 3 Apr 91 16:05:29 GMT
Organization: Sequoia Peripherals

In article <2177@estevax.UUCP> iain@estevax.UUCP (Hr Iain Lea) writes:
>I am converting a piece of BSD code to Sys5 and would like
>to know if there is a ioctl call that has the same function
>as TIOCNOTTY ( void tty ) ?
>

No there isn't. But there is a way of accomplishing the same thing as
this ioctl does, namely getting rid of the process group's control
terminal.
In BSD, one usually calls setpgrp to remove the process and its
descendants from the process group of its parent and then one uses this
ioctl to make sure this process group doesn't have a control terminal.

In System V, the setpgrp does both --- it changes the process group id
of the process to match its pid and it disassociates the process from
its controlling terminal. The only remaining problem is that the first
time this process opens a terminal device it will reacquire a control
terminal, namely the terminal it just opened.

The way to make sure that it won't reacquire a terminal is to fork,
have the parent exit, and do all your work in the child. If a process
does not have a control terminal and if its pid doesn't match its
process group id (i.e., it isn't a process group leader), it can't
acquire a control terminal.

So replace the BSD code:

    int fd;

    /* Leave the parent's process group. */
    setpgrp (0, getpid ());
    /* Give up the control terminal, if there is one. */
    if ((fd = open ("/dev/tty", O_RDWR)) != -1) {
        ioctl (fd, TIOCNOTTY, (void *) NULL);
        close (fd);
    }

with the System V code:

    /* New process group and no control terminal, in one call. */
    setpgrp ();
    /* Continue in a child that is not a process group leader. */
    switch (fork ()) {
    case -1: /* handle error somehow */;
             break;
    case 0 : break;
    default: exit (0);
    }

--
Dave Eisen
1101 San Antonio Rd. Suite 102
Mountain View, CA 94043
(415) 967-5644
dkeisen@Gang-of-Four.Stanford.EDU (for now)

From karels@bsdi.com (Mike Karels)
Path: uunet!vixen.cso.uiuc.edu!gateway
Newsgroups: info.bsdi.users
Subject: Re: Anybody having trouble with processes not exiting on a dialin line?
Date: 2 Nov 94 18:07:44 GMT
Organization: University of Illinois at Urbana
Lines: 85
Approved: Usenet@ux1.cso.uiuc.edu
Message-ID: <199411021807.MAA03096@redrock.BSDI.COM>
NNTP-Posting-Host: ux1.cso.uiuc.edu
Originator: daemon@ux1.cso.uiuc.edu
Xref: uunet info.bsdi.users:7530

This subject comes up periodically. Although I don't know of a bug in
the 1.1 system that would cause this (kernel, sh or csh, at least), it
is probably worth explaining the mechanisms so that folks can figure
out what is going on (and how it is supposed to work).
It works differently in current BSD (starting with Net/2 or so), older
BSD, and System V.

Assume this scenario:

    telnet/rlogin/tty
    login_csh%                  # login csh is process group A
    login_csh% prog1 &          # prog1 is background pgrp B
    login_csh% prog2 &          # prog2 is background pgrp C
    login_csh% stop %2          # stop pgrp C
    login_csh% csh              # new csh is pgrp D
    csh_D% prog3                # prog3 is pgrp E
    ...                         # prog3/pgrp E is in foreground

login_csh is the controlling process/session leader for the login
session. It might be started by an rlogind or telnetd on a pty, or
exec'ed by getty on a regular tty. According to login_csh, the
foreground process is csh_D. csh_D has put prog3 in foreground. The
tty's foreground process group is E. More complicated scenarios include
a program like vi, then a shell escape, or other programs that in turn
spawn a shell.

Now assume that the telnetd/rlogind detect a disconnect, or that the
normal tty loses carrier. In POSIX, this results in a SIGHUP to the
process group of the controlling process (login_csh, pgrp A). In older
systems, the HUP would have gone to the tty's foreground pgrp (E).

If the controlling process is csh, it does a couple of things. One is
that it sends a SIGHUP to its foreground process group (E). Next, if
this csh was not started in its own process group, it would have
created its own pgrp; if so, it returns to its original process group,
and puts that pgrp into the foreground on the tty (as it was when it
was started). It then exits. Note that these actions are different in
1.1 than in 1.0, and a bug in csh prevented the SIGHUP from being sent
to the foreground pgrp (E).

When the controlling process exits, that causes revocation of the tty.
All current open descriptors become attached to a "dead" vnode rather
than the tty, and the tty is closed. Subsequent reads and writes will
return -1 with errno EIO. (Buglet: reads should return end-of-file.)
Also, a SIGHUP is sent to the tty's foreground process group.
In this case, that is E, and prog3 gets a HUP. Also, when the exit of a
process causes another process group to become orphaned (here, pgrp C),
and any member of that group is stopped, the orphaned process group is
sent a SIGHUP and a SIGCONT. prog2 would be able to continue if it
caught SIGHUP, but it would no longer have tty access.

We left csh_D with a pending SIGHUP. It does similar things to
login_csh, sending HUP to its foreground process group (E, this time
redundantly) and then exiting. If there had been a longer chain of
job-control shells, note that we depend on each of them to pass on the
SIGHUP to the next one. If this did not happen, things should still get
cleaned up, but in a different way. Because the tty's foreground
process group gets a SIGHUP, the foreground process will normally exit.
Its parent shell would then read from the tty, and find that it could
not. It would exit, its parent should do the same, etc. However, if any
of these programs (including the original foreground pgrp members) do
something foolish, such as looping when a read returns -1, it won't go
away.

Note that none of this procedure caused a SIGHUP to pgrp B, prog1.
Background processes are not signalled by BSD. prog1 would no longer be
able to read or write the tty, and could not be stopped by a
tty-related stop signal.

If the original situation was that the user did a logout from the login
shell, the situation would be similar except that there would be no
other foreground process group. Other processes in the session would
either be in background or stopped.

One final comment: even if the above fails at some step, any program
that loops when it cannot read from the tty is broken. Ideally, that
bug would be masked by having the program killed by SIGHUP. However, if
the program catches SIGHUP, or it is in background when the login shell
exits, the program will have to handle the situation intelligently.
There is nothing that BSDI can do to prevent broken programs from
looping.

		Mike

Article 7542 of info.bsdi.users:
Path: uunet!spool.mu.edu!howland.reston.ans.net!vixen.cso.uiuc.edu!gateway
From: kre@munnari.OZ.AU (Robert Elz)
Newsgroups: info.bsdi.users
Subject: Re: Anybody having trouble with processes not exiting on a dialin line?
Date: 3 Nov 94 03:08:03 GMT
Organization: University of Illinois at Urbana
Lines: 36
Approved: Usenet@ux1.cso.uiuc.edu
Message-ID: <11869.783832097@munnari.OZ.AU>
References: <199411021807.MAA03096@redrock.BSDI.COM>
NNTP-Posting-Host: ux1.cso.uiuc.edu
Originator: daemon@ux1.cso.uiuc.edu
Xref: uunet info.bsdi.users:7542

    Date: Wed, 02 Nov 1994 12:07:58 -0600
    From: Mike Karels
    Message-ID: <199411021807.MAA03096@redrock.BSDI.COM>

Mike's exposition on POSIX pgrp/SIGHUP, etc, is excellent, and will
certainly go into my library of useful information to keep - I'd never
really attempted to understand how posix intended job control to work,
just that it was much more complex than it used to be...

I also suspect that he has noted the most likely cause for the loops
people are seeing (I just received my first bsdi cdrom late last week,
and am still waiting for something to run it on...) is this ...

    Subsequent reads and writes will return -1 with errno EIO.
    (Buglet: reads should return end-of-file.)

That "buglet" is the likely cause - many programs know they are reading
a terminal, and (sometimes) also know that if reads work (ie: a read
has worked in the past), then no real I/O errors are possible - any -1
returns from "read" indicate things like EINTR (on non bsd systems) and
are promptly ignored. (A hangup will return EOF, not -1, and a hangup
is all that can happen to a terminal, it can't develop "bad blocks" or
similar, even parity errors don't generate read errors.)

That's really a program bug, it should be checking for "ignorable"
errors, and ignoring only those, but it's also very common.
Correcting that "buglet" would probably allow most of those problem
programs to go back to "working", which might be a very good reason not
to correct it.

kre

Article 7573 of info.bsdi.users:
Path: uunet!vixen.cso.uiuc.edu!gateway
From: paul@vix.com (Paul A Vixie)
Newsgroups: info.bsdi.users
Subject: Re: Anybody having trouble with processes not exiting on a dialin line?
Date: 4 Nov 94 09:00:52 GMT
Organization: Vixie Enterprises
Lines: 51
Approved: Usenet@ux1.cso.uiuc.edu
Message-ID:
References: <199411021807.MAA03096@redrock.BSDI.COM>
NNTP-Posting-Host: ux1.cso.uiuc.edu
Originator: daemon@ux1.cso.uiuc.edu
Xref: uunet info.bsdi.users:7573

> than the tty, and the tty is closed. Subsequent reads and writes
> will return -1 with errno EIO. (Buglet: reads should return end-of-file.)

Aha! That's it!

On my systems, it's almost always telnet that spins when the session
dies abnormally. I'm about to check jove, which is the other main
culprit.

In this patch, I #ifdef'd it for __bsdi__ even though the EIO case
should probably be treated the same as EOF on _any_ system. Somebody
else can decide, I know telnet still has a champion somewhere.

*** 1.1 1994/11/04 08:41:03
--- 1.1/usr.bin/telnet/sys_bsd.c 1994/11/04 08:42:48
***************
*** 1134,1137 ****
--- 1134,1141 ----
  	FD_CLR(tin, &ibits);
  	c = TerminalRead(ttyiring.supply, ring_empty_consecutive(&ttyiring));
+ #ifdef __bsdi__
+ 	if (c < 0 && errno == EIO)
+ 	    c = 0;
+ #endif
  	if (c < 0 && errno == EWOULDBLOCK) {
  	    c = 0;

Now for jove. As you can see I have my own opinions on this one; the
code was just wrong the way it was.

*** 1.1 1994/11/04 08:56:35
--- 1.1/contrib/jove/jove/jove.c 1994/11/04 08:58:37
***************
*** 291,294 ****
--- 291,301 ----
  			reads &= ~01;
  			nfds -= 1;
+ 			/*
+ 			 * EOF or EIO when select() set our
+ 			 * bit means Something Bad Happened.
+ 			 */
+ 			if (nchars == 0 ||
+ 			    (nchars < 0 && errno == EIO))
+ 				break;
  		}
  		while (nfds--) {

--
Paul Vixie
La Honda, CA
decwrl!vixie!paul

Article 7592 of info.bsdi.users:
Path: uunet!vixen.cso.uiuc.edu!gateway
From: karels@bsdi.com (Mike Karels)
Newsgroups: info.bsdi.users
Subject: Re: Anybody having trouble with processes not exiting on a dialin line?
Date: 4 Nov 94 16:22:13 GMT
Organization: University of Illinois at Urbana
Lines: 29
Approved: Usenet@ux1.cso.uiuc.edu
Message-ID: <199411041622.KAA01027@redrock.BSDI.COM>
References:
NNTP-Posting-Host: ux1.cso.uiuc.edu
Originator: daemon@ux1.cso.uiuc.edu
Xref: uunet info.bsdi.users:7592

> > than the tty, and the tty is closed. Subsequent reads and writes
> > will return -1 with errno EIO. (Buglet: reads should return end-of-file.)

> Aha! That's it!

> On my systems, it's almost always telnet that spins when the session dies
> abnormally. I'm about to check jove, which is the other main culprit.

> In this patch, I #ifdef'd it for __bsdi__ even though the EIO case should
> probably be treated the same as EOF on _any_ system. Somebody else can
> decide, I know telnet still has a champion somewhere.

To repeat my earlier assertion, any program that loops if it gets a
read error on stdin is broken. POSIX provides for the EIO error under
other circumstances (reading from the tty in background, and either the
process group is orphaned, or the process is ignoring or blocking
SIGTTIN). After a terminal disconnect, POSIX allows either the EIO
error or an EOF. Thus, I don't think the #ifdef should be present. I
forwarded this fix to the current Telnet champion.

I looked at changing this to do an EOF a while back. There is code to
do the right thing in dead_read, but it doesn't work. The vnode type
has been changed from VCHR to VBAD by vgone(). I asked Kirk McKusick
about changing vgone, and changing the various tests for VBAD to check
the vnode ops instead. He said it wasn't that easy, and I haven't
pursued it.
Another possibility is to add a flag to the vnode.

		Mike
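
The advice that runs through this thread (ignore only the read errors
that are known to be harmless, and treat EIO on a dead terminal like
end-of-file) can be sketched roughly as follows. This is only an
illustration, not code from telnet, jove or any BSDI source: the names
drain_input and input_status are made up here, and the sketch assumes a
blocking descriptor. A non-blocking one would also need the
EWOULDBLOCK case that the telnet code above already handles.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch; not from the thread. */
enum input_status {
    INPUT_END,      /* EOF, or EIO treated as a hangup */
    INPUT_ERROR     /* any other read failure; errno is left set */
};

/*
 * Read fd until input ends, counting the bytes seen in *total.
 * Only EINTR is retried; every other outcome ends the loop, so a
 * revoked terminal cannot make the caller spin.
 */
enum input_status drain_input(int fd, size_t *total)
{
    char buf[512];

    *total = 0;
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);

        if (n > 0) {
            *total += (size_t)n;
            continue;
        }
        if (n == 0)
            return INPUT_END;       /* ordinary end-of-file */
        if (errno == EINTR)
            continue;               /* interrupted by a signal; retry */
        if (errno == EIO)
            return INPUT_END;       /* terminal is gone; same as EOF */
        return INPUT_ERROR;         /* unknown error: give up, do not loop */
    }
}
```

A caller would normally exit, or at least stop reading, on either
result. The only difference is whether it is worth reporting errno.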