From: gasbichl@informatik.uni-tuebingen.de (Martin Gasbichler)
Newsgroups: comp.lang.scheme.scsh
Subject: Re: Request for Comment/Developping - Splitting file/bookmarks in
 directories (gnus-like)
Message-ID: <vya4rk4fvgy.fsf@aubisque.informatik.uni-tuebingen.de>
Date: Tue, 26 Feb 2002 10:22:53 +0100

>>>>> "Samir" == Samir Saidani <saidani@info.unicaen.fr> writes:

Samir> Hello,

Hi Samir,

I've added a few comments below. Unfortunately, Olin is currently busy
and not very likely to respond, even though he would be the designated
script reviewer.


Samir> I put here some codes written in few hours. It's self-documented. I'm
Samir> a newbie (please skish hackers, a good tutorial would be very
Samir> convenient !). I've discovering this wonderful shell two weeks

Did you look at the Howto and Resources sections of www.scsh.net?

Samir> ago. Please, skish wizards, take a look on this code. I would like to
Samir> improve it, and add many features : is someone interested ?

Samir> Thanks a lot

Samir> ---------------------------------
Samir> ; REQUEST FOR DEVELOPPING
Samir> ; The general problem is to control (canalize) the dataflow coming
Samir> ; from Internet : downloading files, and bookmarking sites
Samir> ; This script should be considered as a "bootstrap", to resolve this problem
Samir> ; This script would like to be ready to debug/use
Samir> ; saidani@info.unicaen.fr

Samir> ; I use Emacs Scheme mode
Samir> ; M-x compile
Samir> ; use "scsh -s file-split.scm" command

Samir> ;; my User File Hierarchy Standard (UFHS ;-)
Samir> ;; I use a usenet-like hierarchy to organize my directory
Samir> ; comp
Samir> ; comp/lang
Samir> ; comp/lang/smalltalk
Samir> ; comp/lang/lisp
Samir> ; ...
Samir> ; misc
Samir> ; tmp
Samir> ; when I download a file about lisp, I prefix it with "lisp-"
Samir> ; So splitting process could be more easier


Samir> ;; todo : permit the ~ symbol...
Samir> (define file-split-inbox '("/home/saidani/sandbox"))

Samir> ;; example : lisp-gw0012.ps (I usually add a prefix)
Samir> ;; ext : extension
Samir> ;; base : basename
Samir> ;; prefix: prefix
Samir> ;; all : take a regexp
Samir> ;; I try to stay close to gnus splitting way.

Samir> ;; todo: just write the "articles" instead the whole path.

Samir> (define file-split-rules '(
Samir> (ext "mp3" "/home/saidani/sandbox/mp3")
Samir> (ext "zip" "/home/saidani/sandbox/zip")
Samir> (base "article" "/home/saidani/sandbox/articles")))

I would recommend using records here to save you from "caddaar"
excesses. You can also use a finite (enumerated) type for the parts of
the filename, as I do below, to avoid string comparisons and typos, but
maybe symbols are sufficient. In any case I would not recommend using
strings.

See the S48 manual for documentation about these two topics.

Add -o define-record-types -o finite-types to the call to scsh, then

(define-enumerated-type filename-part :filename-part
  filename-part?
  filename-parts
  filename-part-name
  filename-part-index
  (ext base prefix all))

(define-record-type file-split-rule :file-split-rule
  (make-file-split-rule fname-part value dir)
  file-split-rule?
  (fname-part file-split-rule-fname-part)
  (value file-split-rule-value)
  (dir file-split-rule-dir))

Now you can rewrite the example as

(define file-split-rules
  (list (make-file-split-rule (filename-part ext) 
			      "mp3"
			      "/home/saidani/sandbox/mp3")
	etc))


Samir> ;; usecase : (file-extension? "music.mp" "mp3") returns #f
Samir> ;; test ok
Samir> (define (file-extension? fname ext)
Samir>   (if (regexp-search? (rx ,ext) (file-name-extension fname)) #t #f)) 

Small hint: The IF is not necessary here ;-)
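
Spelled out, the hint amounts to returning the result of REGEXP-SEARCH?
directly, since it already yields #t or #f. A minimal sketch, assuming
scsh's regexp-search?, rx and file-name-extension behave as documented;
the same change would apply to file-basename? further down:

```scheme
;; Sketch only: same behaviour as the quoted definition, minus the
;; redundant IF. REGEXP-SEARCH? already returns a boolean.
(define (file-extension? fname ext)
  (regexp-search? (rx ,ext) (file-name-extension fname)))
```

The use case from the comment, (file-extension? "music.mp" "mp3"), still
returns #f with this version.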

Samir> ;; usecase : (file-basename? "article-scheme.ps" "article") returns #t
Samir> ;; test ok
Samir> (define (file-basename? fname base)
Samir>   (if (regexp-search? (rx ,base) (file-name-sans-extension fname)) #t #f))

Samir> ;; Return nil if not match, else return a list of possible destination
Samir> ;; Problem, it returns '( '() '() ...)... hmm. Dirty, no ?
Samir> (define (file-split-match fname)
Samir>   (map
Samir>    (lambda (rules)
Samir>      (cond ((equal? (car rules) 'ext)
Samir> 	  (if (file-extension? fname (cadr rules)) (caddr rules) '()))
Samir>
Samir>  	 ((equal? (car rules) 'base) 
Samir>  	  (if (file-basename? fname (cadr rules)) (caddr rules) '()))
Samir>  	 ))
Samir>  file-split-rules))

This becomes:

(define (file-split-match fname)
  (map
   (lambda (rule) ;; the name "rules" is not appropriate here
     (cond ((eq? (file-split-rule-fname-part rule) (filename-part ext))
            (if (file-extension? fname (file-split-rule-value rule))
                (list (file-split-rule-dir rule))
                '()))
           ;; ditto for the other filename parts
           ))
   file-split-rules))

You can wrap

(apply append ...) 

around the call to MAP to get a list of directories.
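
To illustrate why this works: each rule yields either a one-element list
or '(), and APPEND drops the empty ones. The sketch below is portable
Scheme and deliberately avoids the records above; rule->dirs,
string-suffix-match?, matching-dirs and demo-rules are made-up names
that only stand in for the real per-rule lambda and rule list.

```scheme
;; Hypothetical stand-in for the per-rule LAMBDA in file-split-match:
;; a rule here is just (suffix dir), and a match yields a one-element list.
(define (string-suffix-match? suffix s)
  (let ((n (string-length s))
        (k (string-length suffix)))
    (and (>= n k)
         (string=? suffix (substring s (- n k) n)))))

(define (rule->dirs rule fname)
  (if (string-suffix-match? (car rule) fname)
      (list (cadr rule))
      '()))

;; MAP gives a list of lists; (apply append ...) flattens it by one level,
;; so non-matching rules simply disappear from the result.
(define (matching-dirs rules fname)
  (apply append
         (map (lambda (rule) (rule->dirs rule fname)) rules)))

(define demo-rules
  '((".mp3" "/home/saidani/sandbox/mp3")
    (".zip" "/home/saidani/sandbox/zip")))

(matching-dirs demo-rules "mus.mp3")   ; => ("/home/saidani/sandbox/mp3")
(matching-dirs demo-rules "coucou")    ; => ()
```

With a flat list like this, the (delete '() ...) in the driver loop
should no longer be needed, since move-file already checks for an empty
destination list.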

Samir> ; Take the first matching destination
Samir> ; todo: if the dest doesn't exist, create it.
Samir> ; no move, for the moment, only debugging
Samir> ; I work on a sandbox directory
Samir> ; ls give : article article-azerty.ps coucou bonjour mp3/ mp3file mus.mp3 file-split.scm
Samir> (define (move-file src dest)
Samir>   (if (not (null? dest)) (format #t "move ~a to ~a~%" src (car dest)))) 

Samir> ; Splitting process according split-rules

Samir> (with-cwd (car file-split-inbox)
Samir> 	  (for-each 
Samir> 	   (lambda (fname)
Samir> 	     (format #t "file :~a~%"  fname) ; debug information
Samir> 	     (move-file fname (delete '() (file-split-match fname))))
Samir> 	   (directory-files)))
Samir> ------------------------

Samir> (define bookmark-split-inbox '("/home/saidani/sandbox/bookmarks.html"))

Samir> ; First, purify inbox : html->list
Samir> ; a list of anchor and title '( (anchor title) ...)
Samir> ; we want a new list and add directory 

Samir> ; example '( ("http://www.test.fr" "Lisp User Group") ("http://www.retest.fr" "Skish"))
Samir> ; and output '( ("comp/lang/lisp" 
Samir> ;;                 ("http://www.test.fr" "Lisp User Group") ("http://www.retest.fr" "Skish"))...)

Samir> ; Last, transform list->html


Samir> (define book (run/strings (cat bookmarks.html)))

Forking a process is way too expensive for such a small task; it is
better to do this in scsh itself:

(define (file->string-list filename)
  (let ((port (open-input-file filename)))
    (let lp ()
      (let ((line-or-eof (read-line port)))
        (if (eof-object? line-or-eof)
            '()
            (cons line-or-eof (lp)))))))

(I also added this to the Code Snippets Wiki)

(define book (file->string-list "bookmarks.html"))
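
One possible refinement, offered as a sketch rather than as the version
in the wiki: file->string-list leaves its port open, so a variant built
on CALL-WITH-INPUT-FILE closes it once the lines have been read. The
name file->string-list* is made up for this example, and read-line is
assumed to strip the trailing newline, as it does by default in scsh.

```scheme
;; Sketch: same result as file->string-list, but CALL-WITH-INPUT-FILE
;; closes the port when the procedure returns.
(define (file->string-list* filename)
  (call-with-input-file filename
    (lambda (port)
      (let lp ()
        (let ((line-or-eof (read-line port)))
          (if (eof-object? line-or-eof)
              '()
              (cons line-or-eof (lp))))))))
```

It is used the same way: (define book (file->string-list* "bookmarks.html")).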


Samir> (define test (nth book 10))

Samir> (define foo (match:substring (regexp-search (rx (: "file://" (* any) "html" )) test)))
Samir> (display foo)
Samir> (newline)

Later, you may consider using an HTML library for processing the bookmark file.

-- 
Martin
