BarryK wrote:Hmm, I thought that we had sorted that problem out. When you run from the menu, it runs /usr/sbin/pppoe_gui, and I recall we discussed the resolve.conf problem sometime ago and supposedly solved it.
Anyone want to volunteer to study that script and find out what has gone wrong?
Okay, you have a volunteer.
This bug came up in the Users forum last week. Looking around a bit, I noticed that this bug has been coming up, off and on, for a number of years. So I thought I'd take a look to see if I could figure out why this bug seems to be so squash-resistant, and keeps coming back to haunt new users.
Because of the long history of this bug, I thought that I should first study that history a bit to get the big picture, rather than just look at the suspect scripts and perhaps propose a solution that might have already been tried and eventually failed.
So I did a lot of reading. I downloaded some early Puppies, examined release notes, pored over Barry's old news and new blogs, and studied the various related symptoms reported by forum members over the years, as well as the specific actions that led to those symptoms.
My old eyes probably read way too many posts. They were full of reports about connections that looked good but produced "Address Not Found" errors while browsing, connections that had worked fine at one time but not after using PPPoE, and, of course, the smoking guns: circular links and "Too many levels of symbolic links". (For my convenience while studying them, I made a list of a couple of dozen threads where this issue has come up in the past, including links to them. Rather than now discarding that list, I will append it to the end of this post in case anyone else wants to dig into this a little more.)
The result is that I learned a lot (including stuff unrelated to this bug), and I have a new appreciation for how hard it must be to provide users with a wide choice of networking applications while ensuring that those apps play well with each other. And, along the way, I may have even (re)discovered the solution to this problem.
My apologies if this post is a little too long and boring, but if it helps us to finally lay this bug to rest permanently, perhaps it is worth it. (If not, you are welcome to keep a copy of this by your bedside for those nights when you need a little help getting to sleep.)
BACKGROUND - - -
One of the options available when connecting via PPP is the ability to obtain the primary and secondary DNS nameservers from your Internet Service Provider (ISP), as described in RFC 1877. We don't need to know the details of the RFC because the PPP daemon, pppd, has handled that task for us since 1999. When given the option usepeerdns, pppd will ask the ISP server for the nameservers, and if the server provides them it will place them in the environment variables $DNS1 and $DNS2 and also create or overwrite /etc/ppp/resolv.conf with appropriate nameserver lines.
Of course those lines aren't very useful hiding in /etc/ppp/resolv.conf. They need to find their way to /etc/resolv.conf. The standard way of getting them into /etc/resolv.conf is by including appropriate code in the /etc/ppp/ip-up script. That script is run when pppd successfully completes the connection and has written to /etc/ppp/resolv.conf. The pppd package includes sample code for doing this in scripts/ip-up.local.add.
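To make the mechanism concrete, here is a minimal sketch of the kind of code an ip-up script could use to build /etc/resolv.conf from the $DNS1 and $DNS2 variables that pppd exports. This is not the sample code from the pppd package, and the function name write_peer_dns and the SYS_RESOLV override are my own inventions so that it can be tried against a scratch file instead of the real one.

```shell
#!/bin/sh
# SYS_RESOLV is a hypothetical override for dry runs;
# the real target would be /etc/resolv.conf.
SYS_RESOLV="${SYS_RESOLV:-/etc/resolv.conf}"

# write_peer_dns is an illustrative name, not part of pppd.
write_peer_dns() {
    # pppd sets USEPEERDNS=1 when the usepeerdns option was given.
    [ "$USEPEERDNS" = "1" ] || return 0
    # Do nothing if the peer supplied no nameserver at all.
    [ -n "$DNS1" ] || return 0
    {
        echo "nameserver $DNS1"
        if [ -n "$DNS2" ]; then
            echo "nameserver $DNS2"
        fi
    } > "$SYS_RESOLV"
}
```

Because the function writes a regular file rather than creating a link, it does not care whether any other utility has its own ideas about symlinks.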
PPP IN EARLY PUPPIES - - -
I'm fairly new to the Puppy world, and the oldest Puppy I've found is 1.0.4, so the early history of Puppy is not clear to me, but based on Barry's /etc/ppp/README.txt file in various Puppies, I think the following may be a fair picture of how things were done in pre-1.0.4 Puppies. (I'll welcome corrections from old-timers if this doesn't reflect reality.)
In June of 2003, a utility named "ppp" was used to connect via PPP. This utility generated an /etc/ppp/ip-up script. I have not been able to locate a copy of this utility, but a copy of its documentation, the /usr/share/doc/pppsetup.txt file (dated 2001-06-12) still lingers in Puppy 1.0.4. That file says that the ip-up script will update /etc/resolv.conf if it obtains a nameserver address.
(I am guessing that the script it generated might be the same as the file still included in Puppy 4.3.1 at /etc/ppp/ip-up-EXAMPLE. That script, if it finds the nameservers in $DNS1 and $DNS2, creates or overwrites /etc/resolv.conf with appropriate nameserver lines. Of course, it is just example code now, and doesn't actually get run in Puppy 4.3.1.)
In September of 2003, Barry added Gkdial. In /etc/ppp/README.txt, his comments indicate that the ip-up script is still being used (and he may edit it to eliminate an xmessage which is redundant for Gkdial).
Barry's next comment in /etc/ppp/README.txt, in November of 2004, indicates that Gkdial only writes to /etc/ppp/resolv.conf, not /etc/resolv.conf. So it would appear that either ip-up was no longer being used, or the code in it that updated /etc/resolv.conf had been removed. In the same comment he mentions that /etc/ppp/resolv.conf is a link to /etc/resolv.conf. So the current method of using this link to update /etc/resolv.conf dates from at least 2004. This way pppd actually writes directly to /etc/resolv.conf, so that file is updated even if there is no code in ip-up (or even no ip-up) to do the update.
I've looked through Barry's news items from 2003 and 2004, and there is no mention of the removal of, or changes to, ip-up. My best guess is that the script was never specifically included in Puppy, since it was always created when "ppp -s" was run to set up the configuration. And if the user now chose to run Gkdial and never ran ppp, the script never got created, and so Gkdial didn't update /etc/resolv.conf until Barry added the link.
This link apparently worked well with the other utilities used in early Puppies for connecting via PPP. Except for ppp, many (or all?) of those utilities had no built-in mechanism for updating /etc/resolv.conf, and the symbolic link took care of that.
THE PROBLEM FIRST APPEARS - - -
Then along came Roaring Penguin. Roaring Penguin does have a built-in mechanism for updating /etc/resolv.conf, which it uses if the user chooses to obtain the DNS addresses from their ISP server. Unfortunately, Roaring Penguin's way of doing things conflicted with Puppy's way of doing things. While Puppy makes /etc/ppp/resolv.conf a symlink to /etc/resolv.conf, so that when pppd writes to /etc/ppp/resolv.conf, it actually writes to /etc/resolv.conf, Roaring Penguin takes the opposite approach and uses a symlink from /etc/resolv.conf to /etc/ppp/resolv.conf so that when the resolver reads from /etc/resolv.conf it actually reads from /etc/ppp/resolv.conf. And thus sometimes we get into the situation when both are symlinks and nobody is happy.
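For anyone who wants to see the conflict without risking a working system, the following sketch recreates the two links in a scratch directory. The directory layout and the variable names (demo, loop_result) are only for illustration; nothing under /etc is touched.

```shell
#!/bin/sh
# Reproduce the circular link in a scratch directory (never in /etc).
demo=$(mktemp -d)
mkdir "$demo/ppp"

# Puppy's link: ppp/resolv.conf -> resolv.conf
ln -s "$demo/resolv.conf" "$demo/ppp/resolv.conf"
# Roaring Penguin's link: resolv.conf -> ppp/resolv.conf
ln -s "$demo/ppp/resolv.conf" "$demo/resolv.conf"

# Each name now points at the other, so reading either one fails.
if cat "$demo/resolv.conf" 2>/dev/null; then
    loop_result="readable"
else
    loop_result="unreadable"
fi
echo "$loop_result"

# Clean up the scratch directory.
rm -rf "$demo"
```

Dropping the 2>/dev/null from the cat line should show the error text itself, which on Linux I would expect to be the familiar "Too many levels of symbolic links".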
The standard work-around was, first, to remove both symlinks and, second, to run pppoe-setup (formerly known as adsl-setup) again or simply create /etc/resolv.conf manually. (The downside of this was that if the user made a new /etc/resolv.conf, or ran pppoe-setup and didn't choose to obtain the DNS address from the ISP's server, there would be no linkage between the two resolv.conf files, and using any of Puppy's other PPP utilities would no longer automatically update /etc/resolv.conf, as they previously did. Of course, that was rarely a problem, since ISPs don't change their nameservers very often, but it could be confusing when they did, if the user expected the update to happen automatically as it was supposed to.)
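The first half of that work-around can be sketched as a small shell function. The name clear_resolv_loop and the ETC override are hypothetical (ETC exists only so the function can be tried on a scratch tree), and the function deliberately acts only when both names are symlinks, which is the conflict state described above.

```shell
#!/bin/sh
# ETC is a hypothetical override so this can be tried on a scratch tree.
ETC="${ETC:-/etc}"

# clear_resolv_loop is an illustrative name, not an existing Puppy tool.
clear_resolv_loop() {
    # Act only when both names are symlinks, i.e. the conflict state.
    if [ -L "$ETC/resolv.conf" ] && [ -L "$ETC/ppp/resolv.conf" ]; then
        rm -f "$ETC/resolv.conf" "$ETC/ppp/resolv.conf"
        # Leave an empty regular file behind; the nameserver lines still
        # have to be added by hand or by running pppoe-setup again.
        : > "$ETC/resolv.conf"
        return 0
    fi
    # Nothing was changed.
    return 1
}
```

Note that this removes Puppy's link at /etc/ppp/resolv.conf as well, so it carries the same downside mentioned above: the two files are no longer tied together afterwards.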
On November 22, 2004, Barry put a comment in /etc/ppp/README.txt to tell Roaring Penguin users to rename /etc/ppp/resolv.conf to hide the symlink, and he pointed out that doing so would interfere with the operation of Gkdial.
FIXED IN PUPPY 1.0.7 - - -
Before releasing Puppy 1.0.7 in December 2005, Barry edited Roaring Penguin's pppoe-connect script to disable the lines which made /etc/resolv.conf a symlink to /etc/ppp/resolv.conf. This appears to have solved the problem, as the forums became quiet on this issue for a couple of years.
REAPPEARS IN 3.0.1 - - -
Apparently that change was overlooked in October of 2007 when the Roaring Penguin pet was upgraded from 3.7 to 3.8, so those lines were no longer disabled and the problem they had caused reappeared. I've seen nothing to indicate that disabling those lines had caused a problem, nor that they were restored intentionally. I think it was just an oversight.
NEW FIX TRIED FOR PUPPY 4.0.0 - - -
In April of 2008, before releasing Puppy 4.0, Barry put a warning in pppoe-setup which alerted users to the fact that choosing "server" to obtain the DNS addresses from their ISP server would probably not meet with success. If the user takes the advice to "just press the ENTER key", then when pppoe-connect is run it will not make the symlink, and all is well.
Unfortunately, there is also a suggestion to first try entering "server" to see if it will work, and run the setup script again if it doesn't. If the user does that, the symlink will be created when pppoe-connect is run, and any attempt by pppoe-setup to fix it will fail because of the "Too many levels of symbolic links" error. The link needs to be removed manually.
And that is how it still was as of Puppy 4.3.1.
SUGGESTIONS - - -
My suggestion is to restore Barry's 2005 fix to pppoe-connect by changing these two lines from this:
Code:
rm -f /etc/resolv.conf
ln -s /etc/ppp/resolv.conf /etc/resolv.conf
to this:
Code:
#BK rm -f /etc/resolv.conf
#BK ln -s /etc/ppp/resolv.conf /etc/resolv.conf
Then Barry's 2008 change to pppoe-setup could be removed. That should be enough to fix this problem. Roaring Penguin will still work without that symlink because Puppy has the symlink at /etc/ppp/resolv.conf.
It might also be worth considering eliminating the symlink at /etc/ppp/resolv.conf and, instead, using ip-up to update /etc/resolv.conf, as was done in the very first Puppies. This is apparently the way the developers of pppd intended things to be (they provide sample code for the ip-up script), and it is probably more bullet-proof because it avoids the need for other apps to set up symlinks and get into a symlink shoot-out.
In writing an ip-up script, it would be worth looking at the sample code provided in the scripts/ip-up.local.add in the pppd package. Although the code in the ip-up generated by the ppp utility (which still exists as /etc/ppp/ip-up-EXAMPLE) apparently worked okay, the sample code from the pppd package will actually make a copy of the original /etc/resolv.conf which can be restored by the ip-down script after pppd closes the connection. Thus, if a user had an alternative method of connecting to the Internet (e.g., the wireless access point at the local library) which needed a different /etc/resolv.conf, that file would not be blown away after using PPP.
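The save-and-restore idea might look roughly like the sketch below. To be clear, this is my own illustration of the approach, not a copy of the code in the pppd package. The function names, the three path overrides, and the backup file name are assumptions, and it presumes the symlink at /etc/ppp/resolv.conf has already been removed (if it finds one there, it leaves everything alone).

```shell
#!/bin/sh
# Hypothetical overrides so the functions can be exercised outside /etc.
SYS_RESOLV="${SYS_RESOLV:-/etc/resolv.conf}"
PPP_RESOLV="${PPP_RESOLV:-/etc/ppp/resolv.conf}"
PREV_RESOLV="${PREV_RESOLV:-/etc/ppp/resolv.prev}"

# Would be called from ip-up once pppd has written its nameserver file.
resolv_on_up() {
    [ -n "$USEPEERDNS" ] && [ -f "$PPP_RESOLV" ] || return 0
    # If Puppy's symlink is still in place, pppd has already written
    # through it, so there is nothing to copy.
    [ -L "$PPP_RESOLV" ] && return 0
    rm -f "$PREV_RESOLV"
    # Keep a copy of whatever was there before PPP came up.
    [ -f "$SYS_RESOLV" ] && cp "$SYS_RESOLV" "$PREV_RESOLV"
    cp "$PPP_RESOLV" "$SYS_RESOLV"
}

# Would be called from ip-down after the connection closes.
resolv_on_down() {
    # Put the pre-PPP file back if one was saved.
    if [ -f "$PREV_RESOLV" ]; then
        mv "$PREV_RESOLV" "$SYS_RESOLV"
    fi
}
```

With something like this, the library's wireless settings in the example above would reappear in /etc/resolv.conf as soon as the PPP connection went down.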
On the other hand, we know that the symlink at /etc/ppp/resolv.conf works. Adding an ip-up script might not be worth the additional testing that would be necessary. (On the third hand, if Puppy had continued to use ip-up all along, the symlink conflict would never have happened, and you might now be doing something more fun than reading this long, boring post. So perhaps ip-up could avert another conflict down the road when some new PPP utility didn't play by the same rules as the existing ones. (But (on the fourth hand?) with more and more households installing networks with routers that handle all of the PPP stuff, there may not be many new PPP utilities added to Puppy.))
If Puppy used ip-up, and the symlink at /etc/ppp/resolv.conf was removed, the symlink that Roaring Penguin places at /etc/resolv.conf might not cause a problem anymore. However, I think that Barry's 2005 fix should still be used. First of all, there still might be a way Roaring Penguin's symlink could cause a conflict with some future app. But also, consider this:
People sometimes have trouble connecting to the Internet. (Yes, I know it is hard to believe, but it is true!) When the obvious method of connecting doesn't work, people will try alternatives. Someone who has DSL, but connects via a router, may properly choose "Internet by network or wireless LAN..." from the Internet Connection Wizard. And something might go wrong. So perhaps he remembers that the connection to his ISP uses PPPoE (and he also wants to see just what the heck a "Roaring Penguin" is), so he clicks on that button, and he goes through the set-up dialog and tries to connect. The Roaring Penguin proceeds to blow away the existing /etc/resolv.conf, and replaces it with a link. Because PPPoE will not work for connecting to his router, no connection is made, and so no nameservers ever get added to any resolv.conf. Now no method of connecting will work.
My guess is that this is also one of the various reasons why people who once had everything working sometimes get into the state where they seem to be able to connect okay, but browsing now always results in "Address Not Found".
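A quick way to tell whether a machine has ended up in this state is to look at what /etc/resolv.conf actually is. The following sketch is only a diagnostic aid; the function name resolv_state and the three labels it prints are made up for this post.

```shell
#!/bin/sh
# Report why name lookups might be failing, given a resolv.conf path.
# resolv_state is an illustrative name; it defaults to the real file.
resolv_state() {
    f="${1:-/etc/resolv.conf}"
    if grep -q '^nameserver' "$f" 2>/dev/null; then
        # At least one nameserver line can be read.
        echo "ok"
    elif [ -L "$f" ]; then
        # A symlink that leads to nothing usable (dangling, circular,
        # or pointing at a file with no nameserver lines).
        echo "broken-link"
    else
        # A regular file, or no file at all, without nameserver lines.
        echo "no-nameservers"
    fi
}
```

A result of "broken-link" after a failed Roaring Penguin attempt would be consistent with the scenario described above.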
Note that the configuration file, /etc/ppp/pppoe.conf, supplied with Puppy 4.3.1 (and probably others) sets DNSTYPE=SERVER, so all a user has to do to get into this situation is:
(WARNING: You don't actually want to do this.)
Menu -> Network -> Roaring Penguin PPPoE -> Connect now: START
or
connect -> Internet by 'Roaring Penguin' PPPOE... -> Connect now: START
SUMMARY - - -
It is my opinion that the Roaring Penguins made a bad choice when they chose to link /etc/resolv.conf to /etc/ppp/resolv.conf. Barry's 2005 fix is a good one and should eliminate the current problem. Creating an ip-up script doesn't seem to be currently necessary, but if folks think that it would be worth the effort and risk, it might make for a more robust Puppy when confronted with future PPP apps, if any.
ADDENDUM - - -
In researching this issue, I did a lot of reading of old forum posts related to it. (Both the forum and Barry's daily notes are wonderful resources to have available.) Anyway, for my convenience I made a list of a couple of dozen of the threads, with HTML links to them. Rather than just tossing it out now I offer it here in case anyone else wants to look into the issue as well. (Or use it as a sleep aid, if this post wasn't enough.)
I also made a partial time-line of events in Puppy's early history related to PPP. I hate throwing things away, so I'll add it here in case it might be helpful to someone else.
Okay, now I'm off to file a report on another PPP-related bug that I stumbled over when researching this one. Hopefully that report will be shorter than this one.
--npierce