Puppy Linux Discussion Forum
Puppy HOME page : puppylinux.com
"THE" alternative forum : puppylinux.info
 
filesystem as a database using only shell
technosaurus


Joined: 18 May 2008
Posts: 4280

Posted: Sun 12 May 2013, 22:39    Post subject: filesystem as a database using only shell

I don't know why this has never been done before, but I can't seem to find anything similar via Google, so here is a template for a shell + filesystem based database (only just started, still in the planning stages)

Code:
#!/bin/ash

#+-{ALL_DBS}
#  +-{DB}
#    +-{table}
#      +-0 set -- COL1 COL2 ... #header
#      +-1 COL1="" COL2="" ... #more records follow
#  +-transactions #used to undo

ALL_DBS="$HOME/.SHBASE"
DB="${ALL_DBS}/${1:-default}"
[ -d "$DB" ] || mkdir -p "$DB"

create_table(){
   mkdir -p "$DB/$1"   #$DB already includes $ALL_DBS; $1 is assumed to be the table name
}

drop_table(){
   echo will hide the associated table by renaming to ".$table_name"
   echo this transaction will be recorded for undoing
}

delete_table(){
   echo will actually delete the table
   echo this is not a recorded transaction since it cannot be undone
}

insert_into(){
   echo will record transaction and add the new record
}

resync_db(){
   echo rewrites all records to remove superseded transactions
   echo also reinitializes transaction table when complete
   echo this will speed up new transactions at the cost of lost history
   echo it can also reduce the size of the DB to possibly fit in inodes
}

select(){
   echo will return the fields that match
}

undo_transaction(){
   echo this will eventually revert transaction $1
}

undo_transactions_since(){
   echo this will eventually revert all transactions since transaction $1
}

update(){
   echo transaction, then fields
}

usage(){
   echo $0 database action ...
}

where(){
   echo will return all fields where all $@ matches
}

_________________
Web Programming - Pet Packaging 100 & 101
jpeps

Joined: 31 May 2008
Posts: 3220

Posted: Sun 12 May 2013, 23:34

ACID compliant?
technosaurus


Joined: 18 May 2008
Posts: 4280

Posted: Mon 13 May 2013, 00:51

That is the plan
I've considered using one file per record, with file 0 as the header, in a format like:
COL1 column_two COL3
so that the column order can be loaded with set -- $(cat DB/table/0) (set does not read stdin, so a plain redirect will not do it)
each column name must be a valid shell variable name
to change the order, just echo COL3 COL1 column_two >DB/table/0

then the records themselves, numbered from 1000000000000 (room for 9 trillion records)
I already planned to allow an internal history for each record
COL1='hello' column_two='world' COL3='foo'
#transaction-1234
COL3="bar"

so to read the record it is sourced and you would get
. $DB/$TABLE/$recordnum
echo $COL1 $column_two $COL3
hello world bar

Note: this is slightly simplified as the actual variable names may come from the header file and require an eval echo type statement

To avoid lost updates (a badly written client reads a value, waits, then writes it back after a third transaction has changed it in between), numeric changes can be stored as relative expressions:
#transaction-1235
COL4=$((${COL4}-999))

which is what you would use to decrement a balance by 9.99 (with the balance stored as an integer number of cents)
shell arithmetic doesn't support floating point, so integers only - could use awk or dc for float types (later version maybe)
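
A minimal sketch of the record format described above, to show how appended transactions override earlier values when the file is sourced. The temporary directory, the read_record helper and the COL4 starting balance are made up for the demonstration; they are not part of the actual script.

```sh
#!/bin/sh
# Sketch only: TABLE, read_record and the sample values are invented.
TABLE=$(mktemp -d)            # stand-in for $DB/$TABLE
REC="$TABLE/1000000000000"

# Initial insert: one line of shell assignments.
echo "COL1='hello' column_two='world' COL3='foo' COL4=5000" > "$REC"

# Updates are appended, never rewritten, so the history stays in the file.
printf '%s\n' '#transaction-1234' 'COL3="bar"' >> "$REC"
printf '%s\n' '#transaction-1235' 'COL4=$((${COL4}-999))' >> "$REC"

# Reading = sourcing; later assignments override earlier ones.
# The ( ) body runs in a subshell, so the record's variables
# do not leak into the caller's environment.
read_record() (
   . "$1"
   echo "$COL1 $column_two $COL3 $COL4"
)

read_record "$REC"   # prints: hello world bar 4001
```

Keep in mind that sourcing a record executes whatever is in the file, so this only makes sense when everything that writes records is trusted.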

jpeps

Joined: 31 May 2008
Posts: 3220

Posted: Mon 13 May 2013, 02:43

technosaurus wrote:
That is the plan


Shouldn't take too much more than a million hours of testing; at least that was the case for the sqlite folks. That's after the app is fully developed, of course.
I think Lobster was working on one a while back. Sqlite is already in base, so why not use it?

Shell doesn't even round; dc isn't much better.
technosaurus


Joined: 18 May 2008
Posts: 4280

Posted: Mon 13 May 2013, 04:33

just trying to get something that will work on a busybox-only host
I think that by using the filesystem it can grow as large as the filesystem allows without slowing down much
I'm actually modelling it after sqlite, but even more simplified

I may end up using awk for calculations since it can be built into busybox and has a lot of good math functions
calc(){ awk "BEGIN{print $1}"; }
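
As a rough sketch of how that could also take care of rounding: the variant below uses awk's printf with an optional number of decimals as a second argument. That second argument is my own addition for illustration, not something from the script.

```sh
#!/bin/sh
# Sketch only: the optional DECIMALS argument is an assumption.
# calc EXPR [DECIMALS]  - evaluate EXPR in awk, round to DECIMALS places (default 2)
calc() { awk "BEGIN{printf \"%.${2:-2}f\n\", $1}"; }

calc '10/3'        # 3.33
calc '4001/100'    # integer cents shown as 40.01
calc '2/3' 4       # 0.6667
```

The expression is pasted straight into the awk program, so it should only ever be fed trusted input.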

I have create, insert, update working

Code:
#!/bin/ash

#+-{ALL_DBS}
#  +-{DB}
#    +-{table}
#      +-.0 set -- COL1 COL2 ... #header
#      +-.trans #used to undo
#      +-1000000000000 COL1="" COL2="" ... #more records follow


ALL_DBS="$HOME/.SHBASE"

create_table(){      #makes table $2 in DB $1 with col for each extra arg
   mkdir -p "$TABLE"
   echo "COLS='$@'"   > "$TABLE/.0"
   echo 'LAST=999999999999' >> "$TABLE/.0"
}

databases(){      #lists all databases in $ALL_DBS
   for database in "$ALL_DBS/"*;do
      echo ${database##*/}
   done
}

drop_table(){      #hide $1=DB $2=table by renaming to ".table_name"
   echo this transaction will be recorded for undoing
   mv "$TABLE" "$DB/.${TABLE##*/}"
}

delete_table(){      #delete $1=DB $2=table
   rm -rf "$TABLE"
}

insert(){      #add new record to $DB/$TABLE 'args="entries"'
   . "$TABLE/.0"
   echo "LAST=$(($LAST+1))" >>"$TABLE/.0"
   echo "$@" >> "$TABLE/$(($LAST+1))"
}

resync_db(){
   echo rewrites all records to remove superseded transactions
   echo also reinitializes transaction table when complete
   echo this will speed up new transactions at the cost of lost history
   echo it can also reduce the size of the DB to possibly fit in inodes
}

select(){      #$1=DB $2=table $@ is ???
[ ! "$1" ] && . "$TABLE/.0" && set -- $COLS
[ ! "$RECORDS" ] && RECORDS="*"
#not finished - still thinking for this and where()
}

table_info(){
   . "$TABLE/.0"
   echo $COLS
}

tables(){      #lists all tables in DB
   for TABLE in "$DB/"*;do
      echo ${TABLE##*/}
   done
}

undo_trans(){
   echo this will eventually revert transaction $1
}

undo_since(){
   echo this will eventually revert all transactions since transaction $1
}

update(){       #update records with 'COL_NAME="some value"' ...
   for RECORD in $RECORDS; do
      [ ${RECORD##*/} -lt 1000000000000 ] && RECORD="${RECORD%/*}/$((${RECORD##*/}+1000000000000))"
      echo "$@" >> "$RECORD"
   done
}

usage(){
   echo "$0 [database] [table] [record(s)] action ..."
   echo "$FUNCS"
   exit
}

where(){
   echo will return all record numbers where all $@ matches
}

FUNCS=`while read LINE; do case "$LINE" in *"()""{"*)echo "$LINE";;esac;done < "$0"`
ACTIONS=`echo "$FUNCS"|while read LINE; do printf "${LINE%%(*}|";done`"quit"

while [ "$1" ];do
   eval "case \"$1\" in $ACTIONS)break;;esac"
   [ "$DB" ] && [ "$TABLE" ] && [ "$RECORDS" ] && RECORDS="$RECORDS $TABLE/$1"
   [ "$DB" ] && [ "$TABLE" ] && [ ! "$RECORDS" ] && RECORDS="$TABLE/$1"
   [ "$DB" ] && [ ! "$TABLE" ] && TABLE="$DB/$1"
   [ ! "$DB" ] && DB="$ALL_DBS/$1"
   shift
done
case " $DB $TABLE $RECORDS " in *"./"*)echo nice try;exit;;esac
[ ! "$1" ] || [ "$1" == "--help" ] || [ "$1" == "-h" ] && usage
"$@"

sunburnt


Joined: 08 Jun 2005
Posts: 5010
Location: Arizona, U.S.A.

Posted: Mon 13 May 2013, 18:17

I seem to recall years ago someone did a text file database.

Different from what you're proposing, but maybe simpler?
seaside

Joined: 11 Apr 2007
Posts: 886

Posted: Mon 13 May 2013, 18:52

technosaurus,

What? No awk.

I thought it was your favorite :D

Look Mum! No database!

From someone who's been from spreadsheet to database
and back....

Regards,
s
technosaurus


Joined: 18 May 2008
Posts: 4280

Posted: Mon 13 May 2013, 20:36

awk is perfect for a small file-based database where each table is a single file and records have a separator (see the pet package "database", which could really use some awking), but it is not as good at loading external files on the fly (or I haven't figured out a good way). I wrote one a while back, but that was the one to throw away.

I actually put quite a bit of thought into this structuring.
By separating the records out into files and using the filesystem:
- time to get a record is reduced significantly in large databases
- multiple records can be accessed by separate processes
- locking is handled by filesystem's file locks
- data fields can be of any size without large write penalties
- can run in a cgi script with busybox httpd
- it is easy to port (I typically prototype my C stuff in shell)
- no special tools to browse (just cd and ls or file manager)
- it's easy to integrate with other applications
- allows encryption if the filesystem is encrypted
- allows compression if the filesystem supports it (BTRFS)
- allows extensibility if the filesystem supports it (NFS, GFS, 9p)
- allows simple backups and failover mechanisms (same as filesystem)
- you get last access/modify time for free for each record
- file based tools like inotifyd can do something when records change
- binary data is just a file name
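
On the locking point, here is a hedged sketch of one way to hand out record numbers safely when several processes insert at once. It relies on mkdir either creating a directory or failing, which is atomic on local filesystems. The next_id function, the .lock directory and the .last counter file are hypothetical names and not what the script below uses.

```sh
#!/bin/sh
# Sketch only: next_id, .lock and .last are invented for this example.
TABLE=$(mktemp -d)                     # stand-in for $DB/$TABLE
echo 999999999999 > "$TABLE/.last"     # same starting point as LAST in the script

next_id() {
   # mkdir succeeds for exactly one caller at a time; everyone else waits.
   until mkdir "$TABLE/.lock" 2>/dev/null; do sleep 1; done
   read -r last < "$TABLE/.last"
   last=$((last + 1))
   echo "$last" > "$TABLE/.last"
   rmdir "$TABLE/.lock"                # release the lock
   echo "$last"
}

id=$(next_id)
echo "COL1='hello'" > "$TABLE/$id"     # first record lands in 1000000000000
next_id                                # prints 1000000000001
```

A process that dies between mkdir and rmdir leaves a stale lock behind, so a real version would need a timeout or a cleanup step.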

here is the current hashing
Code:
#!/bin/ash

#+-{ALL_DBS}
#  +-{DB}
#    +-{table}
#      +-.0 set -- COL1 COL2 ... #header
#      +-.trans #used to undo
#      +-1000000000000 COL1="" COL2="" ... #more records follow


ALL_DBS="$HOME/.SHBASE"

create_table(){      #makes table $2 in DB $1 with col for each extra arg
   mkdir -p "$TABLE"
   echo "COLS='$@'"   > "$TABLE/.0"
   echo 'LAST=999999999999' >> "$TABLE/.0"
}

databases(){      #lists all databases in $ALL_DBS
   for database in "$ALL_DBS/"*;do
      echo ${database##*/}
   done
}

drop_table(){      #hide $1=DB $2=table by renaming to ".table_name"
   echo this transaction will be recorded for undoing
   mv "$TABLE" "$DB/.${TABLE##*/}"
}

delete_table(){      #delete $1=DB $2=table
   rm -rf "$TABLE"
}

insert(){      #add new record to $DB/$TABLE 'args="entries"'
   . "$TABLE/.0"
   echo "LAST=$(($LAST+1))" >>"$TABLE/.0"
   echo "$@" >> "$TABLE/$(($LAST+1))"
}

resync_db(){
   echo rewrites all records to remove superseded transactions
   echo also reinitializes transaction table when complete
   echo this will speed up new transactions at the cost of lost history
   echo it can also reduce the size of the DB to possibly fit in inodes
}

select(){      #$1=DB $2=table $@ is ???
[ ! "$1" ] && . "$TABLE/.0" && set -- $COLS
[ ! "$RECORDS" ] && RECORDS="*"
}

table_info(){
   . "$TABLE/.0"
   echo $COLS
}

tables(){      #lists all tables in DB
   for TABLE in "$DB/"*;do
      echo ${TABLE##*/}
   done
}

undo_trans(){
   echo this will eventually revert transaction $1
}

undo_since(){
   echo this will eventually revert all transactions since transaction $1
}

update(){       #update records with 'COL_NAME="some value"' ...
   for RECORD in $RECORDS; do
      [ ${RECORD##*/} -lt 1000000000000 ] && RECORD="${RECORD%/*}/$((${RECORD##*/}+1000000000000))"
      echo "$@" >> "$RECORD"
   done
}

usage(){
   echo "$0 [database] [table] [record(s)] action ..."
   echo "$FUNCS"
   exit
}

where(){
   echo will return all record numbers where all $@ matches
}

FUNCS=`while read LINE; do case "$LINE" in *"()""{"*)echo "$LINE";;esac;done < "$0"`
ACTIONS=`echo "$FUNCS"|while read LINE; do printf "${LINE%%(*}|";done`"quit"

while [ "$1" ];do
   eval "case \"$1\" in $ACTIONS)break;;esac"
   [ "$DB" ] && [ "$TABLE" ] && [ "$RECORDS" ] && RECORDS="$RECORDS $TABLE/$1"
   [ "$DB" ] && [ "$TABLE" ] && [ ! "$RECORDS" ] && RECORDS="$TABLE/$1"
   [ "$DB" ] && [ ! "$TABLE" ] && TABLE="$DB/$1"
   [ ! "$DB" ] && DB="$ALL_DBS/$1"
   shift
done
case " $DB $TABLE $RECORDS " in *"./"*)echo nice try;exit;;esac
[ ! "$1" ] || [ "$1" == "--help" ] || [ "$1" == "-h" ] && usage
"$@"

Flash
Official Dog Handler


Joined: 04 May 2005
Posts: 10936
Location: Arizona USA

Posted: Mon 13 May 2013, 20:40

Is this a Relational Database, or just a plain old Database? Is a database related to a Content-Addressable Memory?
jpeps

Joined: 31 May 2008
Posts: 3220

Posted: Mon 13 May 2013, 21:05

seaside wrote:
technosaurus,

What? No awk.

I thought it was your favorite :D

Look Mum! No database!

From someone who's been from spreadsheet to database
and back....

Regards,
s


The point of using a database is the ability to perform complex joins from multiple tables without having to go through all the gymnastics noted in the link. Of course, someone would first have to learn how to use a database.

Examples:

http://zetcode.com/db/sqlite/joins/
technosaurus


Joined: 18 May 2008
Posts: 4280

Posted: Mon 13 May 2013, 21:17

Flash wrote:
Is this a Relational Database, or just a plain old Database? Is a database related to a Content-Addressable Memory?


Well, it already supports multiple databases with multiple tables containing multiple records with multiple data fields. All that is required to make it relational is a function that sources another record, like:
. DB/Table/record
(pretty simple to do once I get the select function working)
Content-Addressable Memory basically comes free if the filesystem supports it for the FAT, at least for getting to the records (they still need to be processed), and the whole thing can even be kept in RAM if it is stored on a ramfs or tmpfs

I mentioned inotifyd earlier; it could be used as an optional trigger daemon if the database is going to be used by external programs (otherwise triggers can be done internally in a RESTful way on a per-transaction basis)
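
To make the "source another record" idea concrete, here is a small sketch of a lookup across two tables. The customers and orders tables, the CUSTOMER column and the order_with_customer helper are all invented for the example.

```sh
#!/bin/sh
# Sketch only: tables, columns and helper name are made up.
DB=$(mktemp -d)                        # stand-in for $ALL_DBS/$DB
mkdir "$DB/customers" "$DB/orders"
echo "NAME='Alice' CITY='Oslo'"             > "$DB/customers/1000000000000"
echo "ITEM='kibble' CUSTOMER=1000000000000" > "$DB/orders/1000000000000"

# order_with_customer N: source the order, then the record it points to.
# The ( ) body is a subshell, so nothing leaks into the caller.
order_with_customer() (
   . "$DB/orders/$1"
   . "$DB/customers/$CUSTOMER"
   echo "$ITEM $NAME $CITY"
)

order_with_customer 1000000000000      # prints: kibble Alice Oslo
```

If both tables used the same column name, the second source would overwrite the first, so a real join would need some kind of per-table prefix.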

tallboy


Joined: 21 Sep 2010
Posts: 440
Location: Oslo, Norway

Posted: Tue 14 May 2013, 00:13

Quote:
I don't know why this has never been done before, but can't seem to find anything similar via google

It has been done.
I have scrutinized several bookmark files going back to 2005, but without any luck. I know there was a database project that was based primarily on Linux builtins, plus some bash commands, but I cannot remember its name and cannot find any link. I wonder if it was a German or Dutch project, used in an educational setting to make students aware of the possibilities that lie under their fingertips, as opposed to a fancy GUI database.
The problem is mainly that the words you use in a Google query to find such a base, are so general, and give so many hits, that it is almost impossible to use!

tallboy

_________________
True freedom is a live Puppy on a multisession CD/DVD.
jpeps

Joined: 31 May 2008
Posts: 3220

Posted: Tue 14 May 2013, 02:07

tallboy wrote:
Quote:
I don't know why this has never been done before, but can't seem to find anything similar via google

It has been done.
I have scrutinized several bookmark files back to 2005, but without any luck.


Fast forward to 2013: There are a zillion sophisticated GUI apps that utilize databases, and you don't have to read a book on SQL commands to use them.
Sqlite has already found its way onto a few billion devices (none of which use a command line). Developers need to understand the language, and the engines themselves are now extremely well tested and reliable (think BIG DATA).
technosaurus


Joined: 18 May 2008
Posts: 4280

Posted: Tue 14 May 2013, 03:23

Code:
where(){
#stub, but this was the PITA part (todo add other comparisons and iterate over $@ with shift)
eval val=\$${1%=*}
[ "$val" == "${1#*=}" ] && echo match
}
usage: where COL="some value"
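
A possible way to finish the stub, as a sketch only: it loops over every condition and every record, sourcing each record in a subshell. The matches helper and the sample table are assumptions, and it only handles equality so far.

```sh
#!/bin/sh
# Sketch only: matches() and the sample records are invented.
TABLE=$(mktemp -d)                     # stand-in for $DB/$TABLE
echo "COL1='hello' COL3='foo'" > "$TABLE/1000000000000"
echo "COL1='hello' COL3='bar'" > "$TABLE/1000000000001"
echo "COL1='bye' COL3='bar'"   > "$TABLE/1000000000002"

# matches RECORD COL=value ... : succeeds only if every condition holds.
# Subshell body, so one record's variables never affect the next one.
matches() (
   . "$1"; shift
   for cond in "$@"; do
      eval "val=\${${cond%%=*}}"        # column name = text before the first '='
      [ "$val" = "${cond#*=}" ] || exit 1
   done
)

# where COL=value ... : print the number of each matching record.
where() {
   for rec in "$TABLE"/[0-9]*; do
      if matches "$rec" "$@"; then echo "${rec##*/}"; fi
   done
}

where COL1=hello COL3=bar              # prints: 1000000000001
```

Splitting at the first '=' (%%=* and #*=) lets the value itself contain an equals sign. Column names still have to be valid shell variable names, since they go through eval.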

some other related discussions
https://news.ycombinator.com/item?id=5229883
http://webdevrefinery.com/forums/topic/10449-why-are-databases-stored-in-a-single-file/

Note: we can use readahead to get in-memory benefits

Another advantage: symlinks give almost free extra indices, but they should go in hidden directories named .NAME
large binaries should likewise be stored in a hidden .BINARY directory
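
Roughly what a symlink index could look like, as a sketch under my own assumptions about the layout (one hidden directory per indexed column, one subdirectory per value). index_record is a hypothetical helper.

```sh
#!/bin/sh
# Sketch only: the .COLUMN/<value>/ layout and index_record are assumptions.
TABLE=$(mktemp -d)                     # stand-in for $DB/$TABLE
echo "COL1='hello' COL3='bar'" > "$TABLE/1000000000000"
echo "COL1='bye' COL3='bar'"   > "$TABLE/1000000000001"

# index_record RECORD COLUMN: link the record under .COLUMN/<value>/
index_record() (
   . "$TABLE/$1"
   eval "val=\${$2}"
   mkdir -p "$TABLE/.$2/$val"
   ln -s "../../$1" "$TABLE/.$2/$val/$1"   # relative link back to the record
)

for rec in "$TABLE"/[0-9]*; do index_record "${rec##*/}" COL3; done

# Lookup by value is now a directory listing; no record has to be read.
ls "$TABLE/.COL3/bar"
```

Values containing a slash would break the directory name, so they would need escaping or hashing first.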

amigo

Joined: 02 Apr 2007
Posts: 2221

Posted: Tue 14 May 2013, 05:39

When you first started the thread I thought you had in mind to use no files at all -that entries would only consist of directories and/or links.

Using flat files to hold data becomes a space problem since every file has a minimum size even if it only contains one character. The block size of the filesystem determines the minimum size requirement -usually at least 4k.

Since the filesystem is _itself_ a database, the structures are already there -in the metadata. The problems then become handling very long entries, which could exceed the maximum length permitted for directory names; characters that are not allowed in directory names; and the maximum number of directory entries allowed by the filesystem.

I know of an implementation of this idea, where lists of files and/or directories must be compared. Each item gets represented as a directory -no real data is ever written.

Since the size allotted for the metadata of each item is constant, the filesystem will always show as being virtually empty. Using 'ls -l' or 'stat' will tell you everything you need to know.
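
A tiny sketch of the directories-only idea, comparing two lists without writing any file contents. The item names and the common helper are invented for the example, and this is not the implementation mentioned above.

```sh
#!/bin/sh
# Sketch only: the lists and the common() helper are made up.
STORE=$(mktemp -d)
mkdir "$STORE/a" "$STORE/b"

# "Insert" each item as an empty directory; -p makes duplicates harmless.
for item in vim nano awk; do mkdir -p "$STORE/a/$item"; done
for item in awk sed vim;  do mkdir -p "$STORE/b/$item"; done

# common DIR1 DIR2: print the items present in both lists.
# A membership test is nothing more than [ -d ... ].
common() {
   for dir in "$1"/*; do
      item=${dir##*/}
      if [ -d "$2/$item" ]; then echo "$item"; fi
   done
}

common "$STORE/a" "$STORE/b"           # prints awk, then vim
```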

Not long ago I found a shell-based database and mentioned it in a database thread here -I can't find the name at the moment, though...