Ever wanted to download a manual or a wiki from the web
gposil


Joined: 06 Apr 2009
Posts: 1305
Location: Stanthorpe (The Granite Belt), QLD, Australia

PostPosted: Sun 26 Apr 2009, 06:26    Post subject:  Ever wanted to download a manual or a wiki from the web
Subject description: I have this; it should help
 

PMirrorget is a page grabber... you point it at the path or index page you want, and it grabs the pages text-linked from it and downloads them to your local machine... with thanks to Lobster for the original to work off.
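Under the hood the grab is really just a recursive wget call; a minimal sketch of the idea (the URL and folder are placeholders, and the actual command the pet runs is in the script Lobster posts below):
Code:
# Example only: mirror an index page and the pages it links to,
# without climbing above the starting directory.
wget -m -c -np -P /root/manuals "http://example.org/docs/manual/index.html"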

pmwget.jpg
 Description   Screen Shot
 Filesize      21.55 KB
 Viewed        1245 Time(s)

PMirrorget-0.1.pet
 Description   version 0.1
 Filesize      1.41 KB
 Downloaded    425 Time(s)

_________________
Dpup Home
Lobster
Official Crustacean


Joined: 04 May 2005
Posts: 15117
Location: Paradox Realm

PostPosted: Sun 26 Apr 2009, 08:22    Post subject:  

Very useful :) Well done.
Many thanks for using pwget, which is in Puppy; maybe this could be combined with that, or should they stay separate?
Hope this finds its way into Puppy.

Anyway, I used your program to download a website for local/offline viewing. Works as advertised ;)

Have enclosed the code (find it at /usr/bin/pmwget) to give people an indication of how simple GTK + a command-line utility such as wget can be . . .
Code:
#! /bin/bash

# Pmwget created by gposil with thanks to Lobster for Pwget
# April 2009 GPL v3 License
# http://gposil.netne.net

export HELP_DIALOG='
<window title="PMirrorget - Help" resizable="false">
  <vbox>
    <text>
      <label>PMirrorget allows you to download an entire web page and its text-linked pages to a folder on your PC. Copy and paste the URL you wish to download. Use the folder selector to choose the destination. It is designed primarily for grabbing manuals and wiki pages without sifting through them, so you can view them later.</label>
    </text>
    <button>
      <label>Close</label>
      <action type="closewindow">HELP_DIALOG</action>
    </button>
  </vbox>
  </window>
'

export Pmwget='
<window title="PMirrorget - Site Grabber Utility" resizable="false">
<vbox>
 <hbox>
  <text><label>Copy and Paste or type the URL of the required site into "URL". Choose your destination folder and then "Grab It Now!"</label></text>
 </hbox>
 <frame>
 <hbox>
  <text><label>URL:    </label></text>
  <entry accept="directory"><variable>SOURCE</variable><input>/tmp/pm_source_dir</input></entry>
 </hbox>
 <hbox>
  <text><label>Folder:</label></text>
  <entry accept="directory"><variable>DEST</variable><input>/tmp/pm_mirror_dir</input></entry>
  <button>
   <input file icon="gtk-open"></input>
   <action type="fileselect">DEST</action>
   <action>refresh:DEST</action>
  </button>
 </hbox>
 </frame>
 <hbox>
 <frame>
  <button help>
<action type="launch">HELP_DIALOG</action>
  </button>
  <button cancel></button>
  </frame>
  <button>
  <input file>/usr/share/mini-icons/mini.checkmark.xpm</input>
       <label>Grab It Now! </label>
       <action type="exit">OK</action>
  </button>

 </hbox>
</vbox>
</window>'

I=$IFS; IFS=""
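# Run the dialog; gtkdialog prints variable assignments (SOURCE, DEST, EXIT) which are eval'd below.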
for STATEMENTS in  $(gtkdialog3 --program=Pmwget --center); do
   eval $STATEMENTS
done
IFS=$I
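# The "Grab It Now!" button exits with OK: mirror the site into the chosen folder, then open it in ROX-Filer.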
if [ "$EXIT" = "OK" ]; then
  rxvt -name PMirrorget -bg "#F3F2DF" -e wget -m -c -r -np -P "$DEST" "$SOURCE"
  rox -d "$DEST"
fi
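For anyone who would rather skip the GUI, the whole grab boils down to that one wget line; roughly what the flags do (the URL and folder below are just examples, check wget --help to confirm):
Code:
# -m mirror (recursive, with timestamping), -c resume partial downloads,
# -np never climb above the starting directory, -P save everything under this folder
wget -m -c -np -P /root/mywiki "http://example.org/wiki/index.html"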

_________________
Puppy WIKI
aarf

Joined: 30 Aug 2007
Posts: 3620
Location: around the bend

PostPosted: Sun 26 Apr 2009, 09:20    Post subject:  

This is just what I have been thinking of asking for. Having used webzip.exe for Windows quite a lot, I feel your GUI needs some extra fields to be competitive, e.g. a link block for Google ads etc., required file types, follow-link depth and to where... See the WebZIP application for more ideas.
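Most of those could probably be passed straight through to wget; a rough sketch of the flags the extra fields might map to (all the values here are only examples):
Code:
# Hypothetical mapping of extra GUI fields onto existing wget options:
#   --level            follow-link depth
#   --accept           required file types
#   --exclude-domains  block ad/tracker hosts such as Google ads
wget -r -np --level=2 --accept=html,htm,png,jpg \
     --exclude-domains=pagead2.googlesyndication.com,www.google-analytics.com \
     -P /root/grab "http://example.org/manual/"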
_________________

ASUS EeePC Flare series 1025C 4x Intel Atom N2800 @ 1.86GHz RAM 2063MB 800x600p ATA 320G
_-¤-_

<º))))><.¸¸.•´¯`•.#.•´¯`•.¸¸. ><((((º>
hillside


Joined: 02 Sep 2007
Posts: 642
Location: Minnesota, USA. The frozen north.

PostPosted: Sun 26 Apr 2009, 14:23    Post subject:  

This made it very easy to get a quick backup of everything on my website.

Backup. I have to remember to do that now and then.
aarf

Joined: 30 Aug 2007
Posts: 3620
Location: around the bend

PostPosted: Fri 01 May 2009, 02:44    Post subject:  

For what it is worth,

here is the code I used back when I was paying an arm and a leg for internet access.
Code:
#!/bin/sh
# dlist.txt holds the hosts to skip; strip any http:// prefixes and join them with commas for wget.
EXCL=$(sed 's|^ *[a-z]*://||' dlist.txt | tr '\n' ',' | sed 's/,$//')
rxvt -e wget --exclude-domains="$EXCL" -R js,css -E -H -k -p -i rawbookmarks.html


dlist.txt contains the URLs that don't get downloaded:
Code:
http://xyz.freelogs.com
http://www.google-analytics.com
http://ads.bloomberg.com
http://pagead2.googlesyndication.com
http://us.js2.yimg.com
http://visit.webhosting.yahoo.com

and rawbookmarks.html is the list of full URLs of the sites to download.
I recall there are issues with line endings and separators in both these lists, so you have to experiment, i.e. with the use of blank spaces or carriage returns etc.; there are also issues with the actual file type of the lists.
I think, from memory, that one of the options removes images as well, but you will have to check the wget help text to be sure.
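A quick way to sidestep most of the line-ending trouble (just a sketch, using the same file names) is to strip carriage returns and blank lines before handing the lists to wget:
Code:
# Normalise both list files: drop DOS carriage returns and empty lines
# so every URL sits alone on its own line.
tr -d '\r' < rawbookmarks.html | grep -v '^ *$' > rawbookmarks.clean
tr -d '\r' < dlist.txt | grep -v '^ *$' > dlist.clean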

I post this in the hope that it helps with the evolution of the website-downloading GUI, and with incorporating possible upgrades for new functions and fields.

_________________

ASUS EeePC Flare series 1025C 4x Intel Atom N2800 @ 1.86GHz RAM 2063MB 800x600p ATA 320G
_-¤-_

<º))))><.¸¸.•´¯`•.#.•´¯`•.¸¸. ><((((º>
aarf

Joined: 30 Aug 2007
Posts: 3620
Location: around the bend

PostPosted: Fri 01 May 2009, 03:18    Post subject:  

Oh yes, the rawbookmarks.html doesn't contain any
Code:
<html>
tags at all, just a straight list of URLs. I also recall issues with special characters in the URL strings.
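For example, a rawbookmarks.html that wget is happy with can be written like this (placeholder addresses); quoting the heredoc keeps characters such as ? and & away from the shell, and wget then reads them from the file as-is:
Code:
# Example only: one full URL per line, no HTML markup at all.
cat > rawbookmarks.html <<'EOF'
http://example.org/manual/index.html
http://example.net/docs/?page=1&view=print
EOF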
_________________

ASUS EeePC Flare series 1025C 4x Intel Atom N2800 @ 1.86GHz RAM 2063MB 800x600p ATA 320G
_-¤-_

<º))))><.¸¸.•´¯`•.#.•´¯`•.¸¸. ><((((º>
droope


Joined: 31 Jul 2008
Posts: 814
Location: Uruguay, Mercedes

PostPosted: Fri 01 May 2009, 11:23    Post subject:  

Something that would be useful would be the option to download only the links that sit inside a particular div.

Maybe it's too complex, but I think it'd be really, really useful.
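Something along these lines might already get close from the command line (a rough sketch only; the URL and the div id are made up, and a naive sed range like this will trip over nested divs):
Code:
#!/bin/sh
# Cut out the chunk between <div id="content"> and the first closing </div>,
# harvest the href targets inside it, then hand that list to wget.
URL="http://example.org/manual/index.html"
wget -q -O - "$URL" |
  sed -n '/<div id="content">/,/<\/div>/p' |
  grep -o 'href="[^"]*"' | sed 's/^href="//;s/"$//' > /tmp/divlinks.txt
# Links must be absolute for this to work; relative ones would need extra handling.
wget -c -P /root/grab -i /tmp/divlinks.txt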

Cheers,
Droope
smokey01


Joined: 30 Dec 2006
Posts: 1792
Location: South Australia

PostPosted: Fri 21 May 2010, 23:01    Post subject:  

Is it possible to use pmwget to download a blog from Google Blogspot?

I can use their export facility, but that only gives me an XML file, which is of no use.

I want to be able to do an exact backup in HTML format, capturing all the comments and photographs I had previously uploaded.

I have even added the -k parameter on the command line, but that did not work either. I think the problem is the associated username and password; I even tried adding those to the command line, without success.
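If the login really is the blocker, wget can either send credentials or reuse browser cookies; a rough, untested sketch (names and paths are placeholders, and since Blogger's login is a web form rather than HTTP authentication, the cookie route is more likely to help):
Code:
# Untested: HTTP authentication, if the site actually uses it...
wget -m -c -np -P /root/blogbackup --user=myname --password=mypass "http://myblog.blogspot.com/"
# ...or export the browser's cookies to a file and let wget present them instead.
wget -m -c -np -P /root/blogbackup --load-cookies=/root/cookies.txt "http://myblog.blogspot.com/"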

My desired outcome is to copy the entire blog onto a CD or DVD, or maybe just to the HDD, so it can be viewed as if it were on the web.

Anyone got any ideas?

Thanks

_________________
Puppy Software <-> Distros <-> Puppy Linux Tips
smokey01


Joined: 30 Dec 2006
Posts: 1792
Location: South Australia

PostPosted: Fri 21 May 2010, 23:06    Post subject:  

I wonder if it has something to do with the security I have set up.

Only people I give access to can see it.

Maybe I need to make it public to download it.

_________________
Puppy Software <-> Distros <-> Puppy Linux Tips
smokey01


Joined: 30 Dec 2006
Posts: 1792
Location: South Australia

PostPosted: Sat 22 May 2010, 05:21    Post subject:  

OK, I made the site public and pmwget worked a treat.

It did not download the entire site, as there were six pages; I had to download each page separately and then do a little manual HTML linking between them.

It didn't download the bigger photos either, just the thumbnails, but if you click on a thumbnail while you are online it will display the larger photo. Not a bad compromise.
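If the full-size photos sit on a different host to the blog pages, it may just be wget refusing to leave the blog's own domain; something like this (an untested guess, with example domain names) might bring them in:
Code:
# Untested: let wget span hosts, but only onto the listed image servers,
# and fetch each page's requisites so the linked full-size photos come down too.
wget -m -c -np -p -H --domains=myblog.blogspot.com,bp.blogspot.com \
     -P /root/blogbackup "http://myblog.blogspot.com/"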

I didn't bother trying to make the comments work either, although they were downloaded; it looked like too much trouble. But all of the posts downloaded and displayed just as they did on the web.

Good outcome.

_________________
Puppy Software <-> Distros <-> Puppy Linux Tips