Puppy Linux Discussion Forum
Puppy HOME page : puppylinux.com
"THE" alternative forum : puppylinux.info
 
thread_saver
big_bass

Joined: 13 Aug 2007
Posts: 1747

Posted: Sun 18 Dec 2011, 22:27    Post subject: thread_saver
Subject description: save forum threads
 

Thinking about the forum downtime: it's good to back up stuff before problems occur.

*Mu and seaside had the idea for this, and seaside used gtkdialog*
http://www.murga-linux.com/puppy/viewtopic.php?t=62236&search_id=1975402649

I wanted to do the same thing, but I prefer Xdialog,

and I wanted to test the three-input option, so I rewrote the GUI part in Xdialog.

Major update
Updated and modified a lot to be easier for big threads.
It makes a folder, dates it, and renames and renumbers the files
so that it's easier on the browser to scroll quickly.

12-29-2011
Added long names and a filter to clean poorly formatted names.

The end goal is that from here the files can be read and edited quickly
and you can filter out the unneeded posts.

Code:

#!/bin/bash

# thread_saver big_bass completely rewritten to the basics and using Xdialog 
# 12-29-2011
# added date to folder rewritten the download part and the naming
# of the files and numbering 

# original idea was based on
# ThreadGet Seaside 11-24-2010 (Based on Mu's Fetchforum basic program)
# For use on  phpBB forums
# update 3-29-2011 --add startpage, append to existing files


#------------------------------------------------
SEL=`Xdialog \
        --title "thread_saver" \
        --separator "\n" --stdout \
        --3inputsbox  "thread_saver" 0 0 \
            "URL (download this link)" "$1" \
            "end page number" "$2" \
            "name the html file" "$3"`

# split the result into one array: URL, end page, then the words of the name
SEL_ARRAY=($SEL)

THREAD=${SEL_ARRAY[0]}
NPAGE=${SEL_ARRAY[1]}
NAME="${SEL_ARRAY[*]:2:20}"

# NAME holds long names: it starts at the third word
# and takes at most 20 words

# clean badly formatted names: turn symbols into spaces,
# then squeeze each run of spaces into a single underscore
NAME_FIXED=`echo "$NAME" | tr ';"<>,+!@#$?%^*&(){}[]' ' ' | tr -s ' ' '_'`
#------------------------------------------------

add_date=`date "+%m-%d-%y"`
URL_NAME=`basename "$THREAD"`
mkdir -p "/root/Forum_Threads/${NAME_FIXED}-folder-${add_date}"
cd "/root/Forum_Threads/${NAME_FIXED}-folder-${add_date}" || exit 1


# zero start count fix: page N starts at post offset 15*(N-1)
ALL_POSTS=$(( (NPAGE - 1) * 15 ))

# simplified the renaming renumbering code a lot  big_bass
for i in $(seq 0 15 $ALL_POSTS); do

   # page index, counting from 0
   N_ADJ=$(( i / 15 ))

   wget -N "$THREAD&start=$i"
   mv "$URL_NAME&start=$i" "${NAME_FIXED}_${N_ADJ}"
   echo "$URL_NAME&start=$i" "${NAME_FIXED}_${N_ADJ}"
done



Xdialog --title "done" \
       --msgbox "Thread downloaded to /root/Forum_Threads " 0 0 3000
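
The page arithmetic used by the loop above can be checked without downloading anything. This is only a sketch: `page_offsets` is a made-up helper name, and it assumes the forum shows 15 posts per page, as the script does.

```shell
#!/bin/sh
# page_offsets END_PAGE: print the start= offset of every page up to END_PAGE,
# assuming 15 posts per page (page 1 -> 0, page 2 -> 15, ...)
page_offsets() {
    last=$(( ($1 - 1) * 15 ))
    seq 0 15 "$last"
}

# offsets for a three-page thread, one per line
page_offsets 3
```

Each printed number is what the script appends to the thread URL as `&start=`.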



[Attachment: thread_saver.png, 12.73 KB]

_________________
debian wheezy ,linux mint, slackware I use them all and they all have good points
Mint would be best for general users though

Last edited by big_bass on Thu 29 Dec 2011, 11:45; edited 12 times in total
puppyluvr


Joined: 06 Jan 2008
Posts: 3229
Location: Chickasha Oklahoma

Posted: Mon 19 Dec 2011, 00:55

:D Hello,
Thanks.... 8)
I was thinking along the same lines, but never got that far...
Sure beats 1 page at a time... :roll:

Works great, and all on 1 page...
Nice, and useful..

All right now, everyone back up your threads... ;)

@Edit..
Works on puppylinux.info too!!

_________________
Close the Windows, and open your eyes, to a whole new world
http://puppylinuxstuff.meownplanet.net/puppyluvr/
Puppy Linux Users Group on Facebook

Puppy since 2.15CE...
jpeps

Joined: 31 May 2008
Posts: 3220

Posted: Mon 19 Dec 2011, 03:22    Post subject: Re: thread_saver
Subject description: save forum threads
 

big_bass wrote:


Code:

# lets get the three values in 3 separate  arrays
cat /tmp/downloader-info | tr '|' '\n' >/tmp/downloader-info2
select_array=(`cat /tmp/downloader-info2`)
echo ${select_array[0]}
echo ${select_array[1]}
echo ${select_array[2]}

THREAD=${select_array[0]}
NPAGE=${select_array[1]}
NAME=${select_array[2]}
#------------------------------------------------



another way:

Code:

var="$(cat /tmp/downloader-info)"
IFS="|"
set -- $var
THREAD="$1"
NPAGE="$2"
NAME="$3"
big_bass

Joined: 13 Aug 2007
Posts: 1747

Posted: Mon 19 Dec 2011, 11:23

I went the longer way to avoid using

Code:
IFS="|"
set --


because you have to unset IFS
for any code that follows,
because the pipe is a frequently used character
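
One way to get the split without touching IFS for later code is to set it for a single `read` only. This is a sketch rather than what either script in this thread does, and the sample value in `SEL` is made up.

```shell
#!/bin/sh
# Sample Xdialog result with "|" as separator; the URL and values are made up.
SEL='http://example.com/viewtopic.php?t=1|3|my thread'

# IFS is changed only for this one read, so later code still sees the default IFS.
IFS='|' read -r THREAD NPAGE NAME <<EOF
$SEL
EOF

echo "thread=$THREAD pages=$NPAGE name=$NAME"
```

Because the assignment is a prefix to `read`, nothing has to be unset afterwards.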

I already changed the separator to "|" in Xdialog, only
*because the URL has quite a few "/" characters to filter
and the default separator in Xdialog is "/" too.

The main point was how to use the three inputs in Xdialog:
it reduced all that gtkdialog code to just a few lines.

Joe

jpeps

Joined: 31 May 2008
Posts: 3220

Posted: Mon 19 Dec 2011, 12:13

big_bass wrote:


because you have to unset the IFS
for any code that follows
because the pipe is frequently used command


perhaps setting the --separator to " " ? Just an alternative...works fine the way you have it :)

Code:

var="$(cat /tmp/downloader-info)"
 set -- $var
THREAD="$1"
NPAGE="$2"
NAME="$3"
big_bass

Joined: 13 Aug 2007
Posts: 1747

Posted: Mon 19 Dec 2011, 12:36

Major update
Updated and modified a lot to be easier for big threads.
It makes a folder and renames and renumbers the files
so that it's easier on the browser to scroll quickly.

The end goal is that from here the files can be read and edited quickly
and you can filter out the unneeded posts.

Joe

jpeps
Quote:

perhaps setting the --separator to " " ? Just an alternative...works fine the way you have it

I will look at that part again, thanks. It's still an alpha version. *I want to combine some other HTML tools I wrote with this.


Code:
#!/bin/bash

# thread_saver big_bass re written the GUI to use Xdialog to test the three input option
# 12-19-2011

# ThreadGet Seaside 11-24-2010 (Based on Mu's Fetchforum basic program)
# For use on  phpBB forums
# update 3-29-2011 --add startpage, append to existing files


#------------------------------------------------
Xdialog --separator "|" --3inputsbox  "Big Thread downloader" 0 0 "URL (download this link)" "$1"  "end page number" "$2" "name the html file" "$3" 2> /tmp/downloader-info

# split the three "|"-separated values into one array
tr '|' '\n' </tmp/downloader-info >/tmp/downloader-info2
select_array=(`cat /tmp/downloader-info2`)
echo "${select_array[0]}"
echo "${select_array[1]}"
echo "${select_array[2]}"

THREAD=${select_array[0]}
NPAGE=${select_array[1]}
NAME=${select_array[2]}
#------------------------------------------------

n=$((NPAGE*15-15))
mkdir -p /root/Forum_Threads/
mkdir -p /tmp/Forum_Threads
cd /tmp/Forum_Threads || exit 1
# SP (start page) is never set in this version, so this always falls back to 0
if [[ $SP =~ ^[0-9]+$ ]]; then
SP=$((SP*15-15))
else
SP=0
fi

# simplified big_bass
for i in $(seq $SP 15 $n) ; do
wget -O "$(printf "%04d" "$i")" -c "$THREAD&start=$i"
done


# file renumber and rename
cd /tmp/Forum_Threads || exit 1
START_NUMBER=1

NUM=0
ls -1 | sort -n >/tmp/list_forum_pages.txt
for i in `cat /tmp/list_forum_pages.txt`
do

    echo "renumber file --> $NUM"
    mv "/tmp/Forum_Threads/$i" "/tmp/Forum_Threads/${NAME}_${START_NUMBER}.htm"
    let NUM=$NUM+1
    let START_NUMBER=$START_NUMBER+1
done

mkdir -p "/root/Forum_Threads/$NAME"

mv "$NAME"* "/root/Forum_Threads/$NAME"

rm -r  /tmp/Forum_Threads/
rm -f  /tmp/downloader-info
rm -f  /tmp/downloader-info2
rm -f  /tmp/list_forum_pages.txt

Xdialog --title "done" \
       --msgbox "Thread downloaded to /root/Forum_Threads " 0 0 3000

aarf

Joined: 30 Aug 2007
Posts: 3620
Location: around the bend

Posted: Mon 19 Dec 2011, 15:31

if you can name the downloaded thread page by pulling the title from the <title>title</title> tag,
then it can be automated to get many threads.
i have been waiting for years for a gui that produces the necessary code to do matching. my short-term memory quickly forgets the syntax and i have to start from scratch if i want to do this matching stuff. sorry.

_________________

ASUS EeePC Flare series 1025C 4x Intel Atom N2800 @ 1.86GHz RAM 2063MB 800x600p ATA 320G
_-¤-_

<º))))><.¸¸.•´¯`•.#.•´¯`•.¸¸. ><((((º>
big_bass

Joined: 13 Aug 2007
Posts: 1747

Posted: Mon 19 Dec 2011, 15:49

aarf
it's doable, but ... in the one example here you see all the spaces in the name. It can be done; it just needs some
adjustments, some conditioning, to be a "correct" file name ... hey, no problem, that's easy to do
<title>Puppy Linux Discussion Forum :: View topic - Classic Pup 2.14X -- Updated 2 series</title>
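
A rough sketch of that conditioning, for illustration only: `title_to_name` is a made-up helper, and it assumes the saved page keeps its `<title>` on one line with the phpBB "View topic - " prefix shown above.

```shell
#!/bin/sh
# title_to_name FILE: print a file-name-safe version of FILE's <title> text
title_to_name() {
    # keep only the text between <title> and </title>, first match only
    sed -n 's|.*<title>\(.*\)</title>.*|\1|p' "$1" | head -n 1 |
        # drop everything up to and including "View topic - ", if present
        sed 's/^.*View topic - //' |
        # turn every run of characters other than letters, digits, "." and "-"
        # into a single underscore (this also catches slashes and backslashes)
        tr -cs 'A-Za-z0-9.-' '_' |
        # trim a leading or trailing underscore
        sed 's/^_//; s/_$//'
}
```

It could be called as `NAME=$(title_to_name viewtopic_page.htm)`, where the file name is just an example.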


Joe

aarf

Joined: 30 Aug 2007
Posts: 3620
Location: around the bend

Posted: Mon 19 Dec 2011, 16:05

big_bass wrote:
aarf
its doable but ... one example here you see all the spaces in the names it can be done it just needs some
adjustments some conditioning to be a "correct" file name ... hey no problem that's easy to do
<title>Puppy Linux Discussion Forum :: View topic - Classic Pup 2.14X -- Updated 2 series</title>


Joe
ok. i think that, as well as the relevant title bits, a date from the first post could also feature in the name. that will eliminate duplicate names and make it easier to reference. go further and also add the date of the last post and it will be easier to do new backups. something like
1.nov.2010 classic pup 2.14X -- Updated 2 series 20.nov.2011.htm, or whatever date format is easy to search or order in time. possibly also include the original thread number in the name for future backup reference.

(i'll pull my request for .mht image-containing files for now, till it progresses further. ;) )

seaside

Joined: 11 Apr 2007
Posts: 888

Posted: Mon 19 Dec 2011, 16:15

big_bass,

Nice work. (It just shows what can happen when someone who knows what they're doing gets a hold on things) :)

Also, the file I/O could be eliminated:
Code:
SEL=`Xdialog --separator "|" --stdout --3inputsbox  "Thread downloader" 0 0 "URL (download this link)" "$1"  "end page number" "$2" "name the html file" "$3"`

THREAD=`echo "$SEL" | cut -f1 -d'|'`
NPAGE=`echo "$SEL" | cut -f2 -d'|'`
NAME=`echo "$SEL" | cut -f3 -d'|'`


For the life of me, I can't remember why I thought wget had to be run in a terminal for this to work :)

Regards,
s
aarf

Joined: 30 Aug 2007
Posts: 3620
Location: around the bend

Posted: Mon 19 Dec 2011, 16:19

probably would be a good idea to pop over to phpbb devs forum and check to see we're not re-inventing the wheel.
big_bass

Joined: 13 Aug 2007
Posts: 1747

Posted: Mon 19 Dec 2011, 23:39

Hey seaside,

you did a great job,
and I also took your suggestion about the
code snippet you posted today and used it, thanks.
I got a little 'array happy' in that part.
*It's a habit for me to use extra output files when testing
so I can debug stuff quickly. Xdialog either works or it doesn't;
there are no good error messages, but it's mostly easy.


@jpeps: you had a good code snippet too, and it worked,
but I went with seaside's.

@aarf: I have to do some heavy testing with the auto naming part before I add it.
Some people even included slashes, backslashes, spaces and other symbols in the file names, which doesn't play nicely with making directories.

Updated the main post with seaside's suggested
shortened code snippet.

Joe

aarf

Joined: 30 Aug 2007
Posts: 3620
Location: around the bend

Posted: Tue 20 Dec 2011, 02:27

big_bass wrote:

@aarf I have to do some heavy testing with the auto naming part before I add it
some people even included slashes ,back slashes , spaces and other symbols in the file names that doesnt play nicely with making directories



Joe

there has got to be a ready made code snippet that does the job, just a matter of knowing where to find it
perhaps autoname them to their thread number for now. still useful and unique and google search can still be used to find content.

jpeps

Joined: 31 May 2008
Posts: 3220

Posted: Tue 20 Dec 2011, 04:14

big_bass wrote:


@hey jpeps you had a good code snippet too and worked
but I went with seasides


I agree....so
Code:

SEL=`Xdialog --separator " " --stdout --3inputsbox  "Thread downloader" 0 0 "URL (download this link)" "$1"  "end page number" "$2" "name the html file" "$3"`


set -- $SEL

THREAD="$1"
NPAGE="$2"
NAME="$3"
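
With a space as separator, a name of more than one word would lose everything after its first word in `$3`. A possible variation, sketched here with a made-up sample value instead of a real Xdialog call, keeps the whole name and also switches off pathname expansion, since the URL contains a "?".

```shell
#!/bin/sh
# Sample result as Xdialog might print it with --separator " ";
# the URL and the thread name are made up for illustration.
SEL='http://example.com/viewtopic.php?t=1 3 my long thread name'

set -f          # no pathname expansion: the URL contains a "?"
set -- $SEL     # split on whitespace into $1, $2, $3, ...
set +f

THREAD="$1"
NPAGE="$2"
shift 2         # drop the first two fields
NAME="$*"       # everything that is left, joined with single spaces

echo "$NAME"
```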
aarf

Joined: 30 Aug 2007
Posts: 3620
Location: around the bend

Posted: Tue 20 Dec 2011, 09:28

aarf wrote:
big_bass wrote:

@aarf I have to do some heavy testing with the auto naming part before I add it
some people even included slashes ,back slashes , spaces and other symbols in the file names that doesnt play nicely with making directories



Joe

there has got to be a ready made code snippet that does the job, just a matter of knowing where to find it
perhaps autoname them to their thread number for now. still useful and unique and google search can still be used to find content.

naming them by their thread number won't need any matching at all. it will be simple to just replace the name in the code with the number from the step variable. it should be ready to start already (but i am not in thinking mode at present :lol: ). it will need a test for empty downloads so that the page number isn't needed, or it will need to match and thus get the number of pages from the first page.
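
Getting the number of pages from the first page might look roughly like this. It is a sketch only: `last_page` is a made-up helper, and it assumes the downloaded HTML contains the "Page 1 of 2" text the forum shows, possibly with tags around the numbers.

```shell
#!/bin/sh
# last_page FILE: print Y from the first "Page X of Y" found in FILE
last_page() {
    # strip HTML tags first, because the numbers may be wrapped in markup
    sed 's/<[^>]*>//g' "$1" |
        sed -n 's/.*Page [0-9][0-9]* of \([0-9][0-9]*\).*/\1/p' |
        head -n 1
}
```

The result could then replace the "end page number" input, for example `NPAGE=$(last_page first_page.htm)` with whatever name the first page was saved under.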
