All times are UTC - 4
DNS Caching - speed up your browsing by as much as 2000%
technosaurus


Joined: 18 May 2008
Posts: 4872
Location: Blue Springs, MO

PostPosted: Wed 03 Dec 2014, 19:48    Post subject:  DNS Caching - speed up your browsing by as much as 2000%  

When you navigate to a web page, a DNS query must be done against a separate server (typically your ISP's default, set via DHCP) before any HTTP request can be sent to the web server (e.g. port 80 on www.example.com). It is not uncommon for this lookup to take upwards of a whole second. Once that is done and the HTTP GET request has been sent to the actual web server, your browser parses the index.html (or whatever) and then has to repeat the same process for every single JS file, stylesheet, image, piece of embedded media, etc.
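
To see how long a lookup takes on a given machine, one resolver call can be timed. A minimal sketch, assuming getaddrinfo() is the resolver entry point your programs use; lookup_ms and the DEMO guard are made-up names, not part of any libc:

```c
#include <netdb.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>

/* Time one getaddrinfo() call. Returns milliseconds, or -1.0 if it fails.
 * lookup_ms is a hypothetical helper name used only for this sketch. */
double lookup_ms(const char *host){
   struct addrinfo hints = { .ai_family = AF_INET, .ai_socktype = SOCK_STREAM }, *res;
   struct timespec t0, t1;
   clock_gettime(CLOCK_MONOTONIC, &t0);
   int rc = getaddrinfo(host, "80", &hints, &res);
   clock_gettime(CLOCK_MONOTONIC, &t1);
   if (rc != 0) return -1.0;
   freeaddrinfo(res);
   return (t1.tv_sec - t0.tv_sec) * 1e3 + (t1.tv_nsec - t0.tv_nsec) / 1e6;
}

#ifdef DEMO
#include <stdio.h>
int main(int argc, char **argv){
   /* print one timing per host name given on the command line */
   for (int i = 1; i < argc; i++)
      printf("%s: %.1f ms\n", argv[i], lookup_ms(argv[i]));
   return 0;
}
#endif
```

Build with -DDEMO and pass a few host names. Running it twice in a row shows whether anything between you and the DNS server is caching.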

It's no wonder pages take so long to load if each of these lookups takes up to 1s. Simply switching my DNS server from my ISP's to Google's brought the lookup time down from 500+ms to ~25ms, but at that rate a page with 80 additional resources still spends about 2s on lookups (80 × 25ms, not including the download time). Most browsers rely on the OS's builtin caching for repeated access to the same host, but Linux (glibc) does not have any (Android does, but it is buggy). There are also daemons like nscd, but those are buggy too.

Checking glibc, musl and uClibc: none of their libresolv (or libc, if it is integrated) has any builtin caching, so it is up to us to provide it. DNS caching alone will make a big improvement, but the IP addresses come back in no particular order, so it would also be good to sort them by response time (for cases like CDNs that have many IP addresses, some of which may be on the other side of the internet). These 2 little things could provide a significant improvement in user experience for desktop users.
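
The sorting half is the easy part once the response times are known. A rough sketch, assuming the round-trip times have already been measured somehow; struct ip_rtt, sort_by_rtt and the DEMO guard are hypothetical names, and the addresses in the demo are arbitrary:

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical record: one address of a host plus its measured round trip. */
struct ip_rtt {
   uint32_t ip;      /* IPv4 address in network byte order */
   uint32_t rtt_us;  /* measured response time in microseconds */
};

/* qsort comparator: fastest address first. */
static int cmp_rtt(const void *a, const void *b){
   const struct ip_rtt *x = a, *y = b;
   return (x->rtt_us > y->rtt_us) - (x->rtt_us < y->rtt_us);
}

/* Sort n entries in place so that entry 0 is the quickest to respond. */
void sort_by_rtt(struct ip_rtt *v, size_t n){
   qsort(v, n, sizeof *v, cmp_rtt);
}

#ifdef DEMO
int main(void){
   struct ip_rtt v[] = { {0x0100000a, 90000}, {0x0200000a, 12000}, {0x0300000a, 45000} };
   sort_by_rtt(v, 3);
   for (size_t i = 0; i < 3; i++){
      const unsigned char *p = (const unsigned char *)&v[i].ip;
      printf("%d.%d.%d.%d %u us\n", p[0], p[1], p[2], p[3], (unsigned)v[i].rtt_us);
   }
   return 0;
}
#endif
```

Writing the sorted array back to the cache file means the first address read is always the fastest known one.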

But enough B.S. I'll shut up and show you the code.
Code:
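#include <netinet/in.h>
#include <sys/socket.h> //socket, sendto, recvfrom
#include <sys/stat.h> //mkdir
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h> //read, write, close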

int get_value(const char *path, void *buf,size_t len){
   int fd = open(path, O_RDONLY);
   if (fd<0) return fd;
   len=read(fd,buf,len);
   close(fd);
   return len;
}

int set_value(const char *path, void *buf,size_t len){
   int fd = open(path, O_CREAT|O_WRONLY|O_TRUNC, 0644); //O_CREAT needs a mode
   if (fd<0) return fd;
   len=write(fd,buf,len);
   close(fd);
   return len;
}

uint32_t query_ip(const char *host, uint32_t dns){
   unsigned char buf[4096]={0}, *bufp=buf, *end,
      *hp=(unsigned char *)"\0\0" "\x01\0" "\0\x01" "\0\0" "\0\0" "\0\0";
   struct sockaddr_in dest = { //CN 0x72727272 RU 0x3E4C4C3E US2 0x08080808
      .sin_family=AF_INET, .sin_port=htons(53), .sin_addr.s_addr=dns
   };
   socklen_t destsz=sizeof(dest);
   uint32_t i, j, ans, ip=0;
   ssize_t n; //signed, so the error checks below can actually fire
   int   s=socket(AF_INET , SOCK_DGRAM , IPPROTO_UDP);
   if (s<0) return 0;
   if (strlen(host)>253) goto IPV4END; //longest legal name
   for(i=0;i<12;i++) *bufp++=*hp++; //copy header (ID 0, RD set, 1 question)
   i=j=0;
   do{ /* convert www.example.com to 3www7example3com */
      if(host[i]=='.' || !host[i]){ //could use strchrnul() here instead
         *bufp++ = i-j;
         for(;j<i;j++)
            *bufp++=host[j];
         ++j;
      }
   }while(host[i++]);
   *bufp++='\0'; //root label ends the name; DNS names are not padded
   *(bufp++)=0; *(bufp++)=1; *(bufp++)=0; *(bufp++)=1; //QTYPE=A, QCLASS=IN
   n=sendto(s, buf, bufp-buf, 0, (struct sockaddr*)&dest, destsz);
   if (n < 0) goto IPV4END;
   n=recvfrom(s,buf,sizeof(buf),0,(struct sockaddr*)&dest,&destsz);
   if (n < 0) goto IPV4END;
   end=buf+n;
   //the reply echoes the question, so bufp already points at the first answer
   for(i=0;i<buf[7];i++){ //[7] holds num of answers (low byte; [6] is the high byte)
      //skip the record name: plain labels, then a root byte or a 0xC0 pointer
      while(bufp<end && *bufp && (*bufp&0xC0)!=0xC0) bufp+=*bufp+1;
      bufp+=(bufp<end && *bufp)?2:1;
      if (bufp+10 > end) break; //truncated reply
      ans=bufp[1]; //[1] holds the answer type (low byte; [0] is the high byte)
      j=(bufp[8]<<8)|bufp[9]; //RDLENGTH
      bufp += 10; //skip TYPE, CLASS, TTL, RDLENGTH
      if (bufp+j > end) break;
      if(ans == 1 && j == 4){ // ipv4 address
         memcpy(&ip,bufp,4);
         goto IPV4END; //temporary hack ... todo write all of them to cache
      }
      bufp += j; //skip CNAME etc. ... todo read these and make hard links
   }
IPV4END:
   close(s);
   return ip;
}

//This is a wrapper around query_ip, and most of it should eventually move there.
//TODO app to select optimal DNS server and generate the /*/.hosts/dns
//TODO actually read alias host names and make them into hardlinks
//TODO set the TTL by adding it to current time and changing the files mod time.
// ... thus if current time > modified time, do another query
// ... only add a short period of time to "not found" in case its offline
//TODO write separate app/daemon to sort the IP entries from fastest to slowest
// ... a daemon could use an inotify watch on the */.hosts/ directory
uint32_t host2ip(char *host){
   char path[254+sizeof("/etc/.hosts/")]; //using /etc for now -> /tmp ???
   uint32_t dns[2]={0x04020204, 0x08080808}, ip=0; //4.2.2.4 and 8.8.8.8
   //the name must fit in path and must not escape the cache directory
   if (strlen(host)>253 || strchr(host,'/')) return 0;
   strcpy(path,"/etc/.hosts/");
   if (mkdir(path,0755) == 0) //first run: store the default servers
      set_value("/etc/.hosts/" "dns", dns, sizeof(dns));
   else get_value("/etc/.hosts/" "dns", dns, sizeof(dns)); //defaults stay if unreadable
   strcat(path,host); //length was checked above
   if (get_value(path,&ip,sizeof(ip)) != (int)sizeof(ip)){ //cache miss
      if((ip=query_ip(host,dns[0])) || (ip=query_ip(host,dns[1])))
         set_value(path,&ip,sizeof(ip));
   }
   return ip;
}

#ifdef TEST
#include <stdio.h> //printf ... adds ~16k on static musl builds
int main( int argc ,char **argv){
   if (argc < 2) return 1;
   in_addr_t ip=host2ip(argv[1]);
   if (!ip){
      perror("host2ip");
      return 1;
   }
   printf("%d.%d.%d.%d\n",((unsigned char*)&ip)[0],((unsigned char*)&ip)[1],((unsigned char*)&ip)[2],((unsigned char*)&ip)[3]);
   return 0;
}
#endif
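
The name-encoding step in query_ip (www.example.com to 3www7example3com) can also be pulled out into a helper of its own. A sketch under that assumption; dns_encode_name and the DEMO guard are hypothetical names, and the label checks (no empty labels, at most 63 bytes each) are additions that the loop in query_ip does not make:

```c
#include <stddef.h>
#include <string.h>

/* Encode "www.example.com" as \3www\7example\3com\0 into out.
 * Returns the number of bytes written, or 0 if the name does not fit
 * or a label is empty or longer than 63 bytes (the DNS limit).
 * A single trailing dot is accepted. */
size_t dns_encode_name(const char *host, unsigned char *out, size_t outsz){
   size_t n = 0;
   while (*host){
      const char *dot = strchr(host, '.');
      size_t len = dot ? (size_t)(dot - host) : strlen(host);
      /* need room for length byte + label + the final root byte */
      if (len == 0 || len > 63 || n + 1 + len + 1 > outsz) return 0;
      out[n++] = (unsigned char)len;
      memcpy(out + n, host, len);
      n += len;
      host += len;
      if (*host == '.') host++;
   }
   if (n + 1 > outsz) return 0;
   out[n++] = 0;   /* root label terminates the name */
   return n;
}

#ifdef DEMO
#include <stdio.h>
int main(int argc, char **argv){
   unsigned char b[256];
   size_t n = dns_encode_name(argc > 1 ? argv[1] : "www.example.com", b, sizeof b);
   for (size_t i = 0; i < n; i++) printf("%02x ", b[i]);
   printf("(%u bytes)\n", (unsigned)n);
   return 0;
}
#endif
```

The two question fields (type and class) go directly after the returned bytes, with no padding in between.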


Edit: Linux has syscalls for request_key, add_key and keyctl, and it can be configured with a builtin dns_resolver that calls /sbin/request-key. I will probably use this in my libc.h implementation so that the small, fast path stays internal and the slower path is handled by the external binary (which will probably evolve from the code above).

format:
"/sbin/request-key <op> <key> <uid> <gid> <keyring> <keyring> <keyring>"
see:
security/keys and net/dns_resolver and keyutils
This is now plausible after this commit for 3.18:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=0b0a84154eff56913e91df29de5c3a03a0029e38

Edit 2: After some consideration, I came to the conclusion that a filesystem-based cache is just as efficient, easier to implement and more portable than kernel interaction.

_________________
Check out my github repositories. I may eventually get around to updating my blogspot.

Last edited by technosaurus on Mon 15 Dec 2014, 06:20; edited 1 time in total
stemsee

Joined: 27 Jun 2013
Posts: 2539
Location: In The Way

PostPosted: Thu 04 Dec 2014, 05:56    Post subject:  

So, how do I use this script as-is with Firefox 33.1?
technosaurus


Joined: 18 May 2008
Posts: 4872
Location: Blue Springs, MO

PostPosted: Thu 04 Dec 2014, 07:28    Post subject:  

It's not a script; it is C code that would need to be patched into glibc for Firefox to use it. IIRC Firefox also has a builtin caching mechanism that will sometimes foul up and give a no-longer-valid IP address for a CDN host like ajax.googleapis.com or platform.twitter.com (thus the occasional stalls).

Or you could patch the Firefox source by replacing calls to getaddrinfo and gethostbyname with a wrapper around a modified version of this code.
See: https://github.com/mozilla/gecko-dev/blob/master/netwerk/dns/GetAddrInfo.cpp
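
The wrapper idea, in its simplest form, looks something like this. It is a sketch only: it keeps the cache in memory instead of on disk, has no TTL or eviction, handles one IPv4 address per host, and cached_lookup, CACHE_SLOTS and the DEMO guard are made-up names rather than anything from Firefox or a libc:

```c
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

#define CACHE_SLOTS 64          /* hypothetical table size */

static struct { char host[256]; in_addr_t ip; } cache[CACHE_SLOTS];
static unsigned cache_used, cache_hits;

/* Resolve host to one IPv4 address (network byte order), consulting a small
 * in-process table before falling back to getaddrinfo().
 * Returns 0 on success, -1 on failure. */
int cached_lookup(const char *host, in_addr_t *ip){
   struct addrinfo hints = { .ai_family = AF_INET, .ai_socktype = SOCK_STREAM }, *res;
   if (strlen(host) >= sizeof cache[0].host) return -1;
   for (unsigned i = 0; i < cache_used; i++)
      if (!strcmp(cache[i].host, host)){
         *ip = cache[i].ip;     /* hit: no DNS traffic at all */
         cache_hits++;
         return 0;
      }
   if (getaddrinfo(host, NULL, &hints, &res) != 0) return -1;
   *ip = ((struct sockaddr_in *)res->ai_addr)->sin_addr.s_addr;
   freeaddrinfo(res);
   if (cache_used < CACHE_SLOTS){   /* table full: just stop caching */
      strcpy(cache[cache_used].host, host);
      cache[cache_used++].ip = *ip;
   }
   return 0;
}

#ifdef DEMO
#include <stdio.h>
int main(int argc, char **argv){
   in_addr_t ip;
   if (argc < 2) return 1;
   /* the second call for the same name should be served from the table */
   if (cached_lookup(argv[1], &ip) || cached_lookup(argv[1], &ip)) return 1;
   printf("hits: %u\n", cache_hits);
   return 0;
}
#endif
```

An in-memory table like this dies with the process, which is the reason for keeping the real cache on disk where every program can share it.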

What I have so far is just a proof of concept, but for lots of small downloads from the same host with a patched wget it makes a big difference. I'm still working on sorting the multiple retrieved IP addresses themselves by response time.

technosaurus


Joined: 18 May 2008
Posts: 4872
Location: Blue Springs, MO

PostPosted: Sat 13 Dec 2014, 01:46    Post subject:  

Update:
Here is an example downloader (compiles to a ~5kB static binary) that simply fetches the given host and path over HTTP and writes the response body to stdout:

Code:
#include <netinet/in.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

//note: output was truncated if offset==sizeof(buf)-1 afterwards (buf limit is INT_MAX)
#define strcpyALL(buf, offset, ...) do{ \
   char *bp=(char*)(buf+offset); \
   const char *s, \
      *a[] = { __VA_ARGS__,NULL}, \
      **ss=a; \
   while((s=*ss++)) \
      while((*s)&&(offset<(int)sizeof(buf)-1)){ \
         *bp++=*s++; ++offset; \
      } \
   *bp=0; \
}while(0)

char buf[4096]={0};

static inline uint32_t query_ip(const char *host){
   char *bufp=buf,
      *hp=(char *)"\0\0" "\x01\0" "\0\x01" "\0\0" "\0\0" "\0\0";
   struct sockaddr_in dest = {
      .sin_family=AF_INET, .sin_port=htons(53), .sin_addr.s_addr=0x04020204
   };
   uint32_t ans, ip=0;
   int   s=socket(AF_INET , SOCK_DGRAM , IPPROTO_UDP);
   if (s<0) return 0;
   while(bufp-buf<12) *bufp++=*hp++; //copy header
   int i=0,j=0;
   do{ /* convert www.example.com to 3www7example3com */
      if(host[i]=='.' || !host[i]){ //could use strchrnul() here instead
         *bufp++ = i-j;
         for(;j<i;j++)
            *bufp++=host[j];
         ++j;
      }
   }while(host[i++]);
   *bufp++='\0';
   //no padding: the question fields follow the name directly
   *(bufp++)=0; *(bufp++)=1; *(bufp++)=0; *(bufp++)=1; //extra Q fields
   if(( connect(s, (struct sockaddr*)&dest,sizeof(dest))) != 0 ) goto IPV4END;
   if((write(s, buf, bufp-buf))<0) goto IPV4END;
   if((read(s,buf,sizeof(buf)))<0) goto IPV4END;
   for(i=0;i<buf[7];i++){ //[7] holds num of answers([6] does too but >256?)
      while(*bufp) ++bufp; //skip names
      ans=bufp[1]; //[1] holds the answer type ([0] does too, but >256???)
      bufp += 10;
      if(ans == 1){ uint32_t j=4; // ipv4 address
         char *ipp=(char *)&ip;
         while(j--) *ipp++=*bufp++;
         goto IPV4END;
      }else while(*bufp) ++bufp; //skip (alias) names
   }
IPV4END:
   close(s);
   return ip;
}

static inline void get(const char *host, const char *path){
   struct sockaddr_in dest = {
      .sin_family=AF_INET,.sin_port=htons(80),.sin_addr.s_addr=query_ip(host)
   };
   int len=0, sz=sizeof(dest), s=socket(AF_INET,SOCK_STREAM,IPPROTO_TCP);
   if (s<0) return;
   if (!dest.sin_addr.s_addr) goto GETEND; //lookup failed
   if(( connect(s, (struct sockaddr*)&dest,sz)) != 0 ) goto GETEND;
   //HTTP wants CRLF line endings
   strcpyALL(buf,len,"GET ",path," HTTP/1.0\r\nHost: ",host,"\r\n\r\n");
   if((write(s,buf,len))<0) goto GETEND;
   len=read(s,buf,sizeof(buf)-1); //leave room for the terminator strstr needs
   if (len<0) goto GETEND;
   else{
      buf[len]=0;
      //assumes the whole header arrives in the first read
      char *bp=strstr(buf,"\r\n\r\n");
      if (bp==NULL) goto GETEND;
      bp+=4;
      len-=(bp-buf);
      write(1,bp,len);
      while ((len=read(s,buf,sizeof(buf)))>0)
         write(1,buf,len);
   }
GETEND:
   close(s);
}

int main(int argc, const char **argv){
   if (argc<3) return 1;
   get(argv[1],argv[2]);
   return 0;
}



And an example of how to store the cache on disk (handling of multiple IP addresses has been temporarily removed since the last version):
Code:
#include <netinet/in.h>
#include <arpa/inet.h> //ntohl
#include <sys/socket.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/stat.h>
#include <utime.h>
#include <time.h>

#define HOSTDIR "/etc/.hosts/" //maybe "/tmp/.hosts/" ???

static int get_value(const char *path, void *buf,size_t len){
   int fd = open(path, O_RDONLY);
   if (fd<0) return fd;
   len=read(fd,buf,len);
   close(fd);
   return len;
}
static int set_value(const char *path, void *buf,size_t len){
   int fd = open(path, O_CREAT|O_WRONLY|O_TRUNC, 0644); //O_CREAT needs a mode
   if (fd<0) return fd;
   len=write(fd,buf,len);
   close(fd);
   return len;
}

//In: *ip is the DNS server to ask. Out: *ip is the first A record (0 if none).
//Returns that record's TTL in seconds.
uint32_t query_ip(const char *host, uint32_t *ip){
   unsigned char buf[4096]={0}, *bufp=buf, *end,
      *hp=(unsigned char *)"\0\0" "\x01\0" "\0\x01" "\0\0" "\0\0" "\0\0";
   struct sockaddr_in dest = {
      .sin_family=AF_INET, .sin_port=htons(53), .sin_addr.s_addr=*ip
   };
   uint32_t ttl=0, i, j, ans;
   ssize_t n;
   int   s=socket(AF_INET , SOCK_DGRAM , IPPROTO_UDP);
   *ip=0;
   if (s<0) return 0;
   if (strlen(host)>253) goto IPV4END; //longest legal name
   for(i=0;i<12;i++) *bufp++=*hp++; //copy header
   i=j=0;
   do{ /* convert www.example.com to 3www7example3com */
      if(host[i]=='.' || !host[i]){ //could use strchrnul() here instead
         *bufp++ = i-j;
         for(;j<i;j++)
            *bufp++=host[j];
         ++j;
      }
   }while(host[i++]);
   *bufp++='\0'; //root label ends the name; DNS names are not padded
   *(bufp++)=0; *(bufp++)=1; *(bufp++)=0; *(bufp++)=1; //QTYPE=A, QCLASS=IN
   if((connect(s, (struct sockaddr*)&dest,sizeof(dest))) != 0 ) goto IPV4END;
   if((write(s, buf, bufp-buf))<0) goto IPV4END;
   if((n=read(s,buf,sizeof(buf)))<0) goto IPV4END;
   end=buf+n;
   for(i=0;i<buf[7];i++){ //[7] holds num of answers (low byte; [6] is the high byte)
      //skip the record name: plain labels, then a root byte or a 0xC0 pointer
      while(bufp<end && *bufp && (*bufp&0xC0)!=0xC0) bufp+=*bufp+1;
      bufp+=(bufp<end && *bufp)?2:1;
      if (bufp+10 > end) break; //truncated reply
      ans=bufp[1]; //[1] holds the answer type (low byte; [0] is the high byte)
      j=(bufp[8]<<8)|bufp[9]; //RDLENGTH
      if (bufp+10+j > end) break;
      if(ans == 1 && j == 4){
         memcpy(&ttl,bufp+4,4); ttl=ntohl(ttl); //memcpy avoids unaligned reads
         memcpy(ip,bufp+10,4);
         if (*ip) goto IPV4END;
      }
      bufp+=10+j; //skip (alias) names TODO make hard links
   }
   ttl=0; //no usable A record
IPV4END:
   close(s);
   return ttl;
}


uint32_t host2ip(char *host){
   char path[254+sizeof(HOSTDIR)];
   uint32_t ip=0x04020204, ttl; //default DNS server 4.2.2.4
   struct stat st;
   time_t t=time(NULL);

   //the name must fit in path and must not escape HOSTDIR
   if (strlen(host)>253 || strchr(host,'/')) return 0;
   strcpy(path,HOSTDIR);
   if (mkdir(path,0755) == 0){ // initialize (directories need the x bit)
      set_value(HOSTDIR "dns", &ip, sizeof(ip));
   }else get_value(HOSTDIR "dns", &ip, sizeof(ip));
   strcat(path,host); //length was checked above
   //check if file modified date is in the future (our TTL hack)
   if ((stat(path,&st)!=-1)&&(t<st.st_mtime)){
      uint32_t cached; //separate variable so a failed read cannot clobber the DNS server
      if (get_value(path,&cached,sizeof(cached))==(int)sizeof(cached))
         return cached;
   }
//move this block into query_ip
   ttl=query_ip(host,&ip); //in: DNS server, out: resolved address; returns the TTL
   if (ip){
      struct utimbuf ut={.actime=t, .modtime=t+ttl}; //expiry time goes in mtime
      set_value(path,&ip,sizeof(ip));
      utime(path,&ut);
   }
   return ip;
}

#ifdef TEST
#include <stdio.h> //printf ... adds ~16k on static musl builds
int main( int argc ,char **argv){
   if (argc < 2) return 1;
   in_addr_t ip=host2ip(argv[1]);
   if (!ip){
      perror("host2ip");
      return 1;
   }
   printf("%d.%d.%d.%d\n",((unsigned char*)&ip)[0],((unsigned char*)&ip)[1],((unsigned char*)&ip)[2],((unsigned char*)&ip)[3]);
   return 0;
}
#endif
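
Parsing the reply is the fiddly part, because a record name may be plain labels, a 2-byte compression pointer, or labels followed by a pointer. Here is a standalone sketch of just that step, working on a reply that is already in memory. skip_name, first_a_record and the DEMO guard are made-up names, and only the header counts, the echoed questions and the answer records are looked at:

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Return the offset just past the (possibly compressed) name at off, or 0 on error. */
static size_t skip_name(const unsigned char *m, size_t len, size_t off){
   while (off < len){
      unsigned char c = m[off];
      if (c == 0) return off + 1;               /* root label: end of name */
      if ((c & 0xC0) == 0xC0)                   /* 2-byte pointer always ends a name */
         return off + 2 <= len ? off + 2 : 0;
      if (c & 0xC0) return 0;                   /* reserved label types */
      off += 1 + (size_t)c;                     /* ordinary label */
   }
   return 0;
}

/* Find the first A record in a DNS reply. On success store the address
 * (network byte order) in *ip, the TTL in seconds in *ttl, and return 1. */
int first_a_record(const unsigned char *m, size_t len, uint32_t *ip, uint32_t *ttl){
   if (len < 12) return 0;
   unsigned qd = (m[4] << 8) | m[5], an = (m[6] << 8) | m[7];
   size_t off = 12;
   while (qd--){                                /* skip the echoed questions */
      off = skip_name(m, len, off);
      if (!off || off + 4 > len) return 0;
      off += 4;                                 /* QTYPE + QCLASS */
   }
   while (an--){
      off = skip_name(m, len, off);
      if (!off || off + 10 > len) return 0;
      unsigned type = (m[off] << 8) | m[off+1];
      unsigned rdlen = (m[off+8] << 8) | m[off+9];
      if (off + 10 + rdlen > len) return 0;
      if (type == 1 && rdlen == 4){             /* A record */
         *ttl = ((uint32_t)m[off+4] << 24) | ((uint32_t)m[off+5] << 16)
              | ((uint32_t)m[off+6] << 8) | m[off+7];
         memcpy(ip, m + off + 10, 4);
         return 1;
      }
      off += 10 + rdlen;                        /* CNAME etc.: skip by RDLENGTH */
   }
   return 0;
}

#ifdef DEMO
#include <stdio.h>
#include <unistd.h>
int main(void){
   /* reads one raw DNS reply from stdin */
   unsigned char m[4096];
   uint32_t ip, ttl;
   ssize_t n = read(0, m, sizeof m);
   if (n < 0 || !first_a_record(m, (size_t)n, &ip, &ttl)) return 1;
   printf("%d.%d.%d.%d ttl=%u\n", ((unsigned char*)&ip)[0], ((unsigned char*)&ip)[1],
          ((unsigned char*)&ip)[2], ((unsigned char*)&ip)[3], (unsigned)ttl);
   return 0;
}
#endif
```

Skipping every non-A record by its RDLENGTH means a CNAME chain in front of the address does not confuse the loop.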

catsezmoo

Joined: 09 Feb 2014
Posts: 26

PostPosted: Sun 21 Dec 2014, 20:51    Post subject:  

Quote:
Or you could patch the firefox source
Patched or unpatched, have you ever successfully built a recent version of Firefox?
Asking because I'm uncertain that my system (2.8GHz Core 2 Duo, 4GB RAM) could handle it.
technosaurus


Joined: 18 May 2008
Posts: 4872
Location: Blue Springs, MO

PostPosted: Mon 22 Dec 2014, 00:54    Post subject:  

If the libc is patched to do its own caching like this, you can just disable Mozilla's builtin caching.

I may end up making a patch for musl libc (+ maybe glibc) after I get some of the kinks worked out. The parts that do this for glibc are in libresolv, which on Fatdog is 71kB (about 1/6th the size of the entire musl libc, which includes libresolv). It would make sense to include the simplified bits for getting the cached results in libc (~6 lines of code), or to call an external binary to fill the cache and do a timed-out wait with select().
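
The timed-out wait could look something like this. A sketch only: wait_readable and the DEMO guard are made-up names, and fd is assumed to be a pipe or socket that the external resolver binary writes to once the cache entry is ready.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Wait until fd is readable or timeout_ms milliseconds have passed.
 * Returns 1 if readable, 0 on timeout, -1 on error (including EINTR). */
int wait_readable(int fd, int timeout_ms){
   fd_set rfds;
   struct timeval tv = { .tv_sec = timeout_ms / 1000,
                         .tv_usec = (timeout_ms % 1000) * 1000 };
   FD_ZERO(&rfds);
   FD_SET(fd, &rfds);
   int r = select(fd + 1, &rfds, NULL, NULL, &tv);
   return r > 0 ? 1 : r;
}

#ifdef DEMO
#include <stdio.h>
int main(void){
   /* waits up to 2 seconds for something to arrive on stdin */
   int r = wait_readable(0, 2000);
   puts(r == 1 ? "readable" : r == 0 ? "timed out" : "error");
   return 0;
}
#endif
```

On timeout the caller can fall back to doing the query itself or simply report failure.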

Firefox's builtin caching breaks all the time for static resources on dynamic CDNs (it tries to use a no-longer-valid IP). This is why you occasionally get a perpetual "Waiting for ...".

If the functionality is shifted to an external binary, then you can do some nifty little tricks after you write the file for the initial response to the current query. For example, you could ping all of the addresses simultaneously to sort them by response time (using multiple threads)... afterwards it could walk the directory to see if there are any entries to purge/update.
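
The purge pass could be as simple as the sketch below. It assumes the expiry time is stored as each file's modification time (the TTL hack from the on-disk cache example) and that the DNS server setting lives in a file called dns. purge_expired and the DEMO guard are hypothetical names.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/* Remove every regular file in dir whose modification time (used as the
 * expiry stamp) is not later than now, except the "dns" settings file.
 * Returns the number of files removed, or -1 if dir cannot be opened. */
int purge_expired(const char *dir, time_t now){
   DIR *d = opendir(dir);
   if (!d) return -1;
   int removed = 0;
   struct dirent *e;
   while ((e = readdir(d))){
      char path[4096];
      struct stat st;
      if (!strcmp(e->d_name, ".") || !strcmp(e->d_name, "..") || !strcmp(e->d_name, "dns"))
         continue;
      if (snprintf(path, sizeof path, "%s/%s", dir, e->d_name) >= (int)sizeof path)
         continue;                         /* path too long: skip it */
      if (stat(path, &st) == 0 && S_ISREG(st.st_mode) && st.st_mtime <= now)
         if (unlink(path) == 0) removed++;
   }
   closedir(d);
   return removed;
}

#ifdef DEMO
int main(int argc, char **argv){
   /* directory to clean is given on the command line */
   if (argc < 2) return 1;
   printf("removed %d entries\n", purge_expired(argv[1], time(NULL)));
   return 0;
}
#endif
```

Instead of deleting, the same walk could re-query each stale name so that popular hosts never fall out of the cache.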

slavvo67

Joined: 12 Oct 2012
Posts: 1616
Location: The other Mr. 305

PostPosted: Tue 06 Dec 2016, 19:05    Post subject:  

Speeding up browsing is always good. I hope you move forward with this for everyone's benefit. Probably a bit beyond my current level but if you need a tester or anything, I'd be willing to help. Currently running Seamonkey, Firefox and Iron.

Slavvo67