From: gasbichl@informatik.uni-tuebingen.de (Martin Gasbichler)
Newsgroups: comp.lang.scheme.scsh
Subject: Re: Request for Comment/Developping - Splitting file/bookmarks in
directories (gnus-like)
Message-ID: <vya4rk4fvgy.fsf@aubisque.informatik.uni-tuebingen.de>
Date: Tue, 26 Feb 2002 10:22:53 +0100
>>>>> "Samir" == Samir Saidani <saidani@info.unicaen.fr> writes:
Samir> Hello,
Hi Samir,
I've added a few comments below. Unfortunately, Olin is currently busy
and not very likely to respond, although he would be the obvious person
to review a script like this.
Samir> I put here some codes written in few hours. It's self-documented. I'm
Samir> a newbie (please skish hackers, a good tutorial would be very
Samir> convenient !). I've discovering this wonderful shell two weeks
Did you look at the Howto and Resources sections of www.scsh.net?
Samir> ago. Please, skish wizards, take a look on this code. I would like to
Samir> improve it, and add many features : is someone interested ?
Samir> Thanks a lot
Samir> ---------------------------------
Samir> ; REQUEST FOR DEVELOPPING
Samir> ; The general problem is to control (canalize) the dataflow coming
Samir> ; from Internet : downloading files, and bookmarking sites
Samir> ; This script should be considered as a "bootstrap", to resolve this problem
Samir> ; This script would like to be ready to debug/use
Samir> ; saidani@info.unicaen.fr
Samir> ; I use Emacs Scheme mode
Samir> ; M-x compile
Samir> ; use "scsh -s file-split.scm" command
Samir> ;; my User File Hierarchy Standard (UFHS ;-)
Samir> ;; I use a usenet-like hierarchy to organize my directory
Samir> ; comp
Samir> ; comp/lang
Samir> ; comp/lang/smalltalk
Samir> ; comp/lang/lisp
Samir> ; ...
Samir> ; misc
Samir> ; tmp
Samir> ; when I download a file about lisp, I prefix it with "lisp-"
Samir> ; So splitting process could be more easier
Samir> ;; todo : permit the ~ symbol...
Samir> (define file-split-inbox '("/home/saidani/sandbox"))
Samir> ;; example : lisp-gw0012.ps (I usually add a prefix)
Samir> ;; ext : extension
Samir> ;; base : basename
Samir> ;; prefix: prefix
Samir> ;; all : take a regexp
Samir> ;; I try to stay close to gnus splitting way.
Samir> ;; todo: just write the "articles" instead the whole path.
Samir> (define file-split-rules '(
Samir> (ext "mp3" "/home/saidani/sandbox/mp3")
Samir> (ext "zip" "/home/saidani/sandbox/zip")
Samir> (base "article" "/home/saidani/sandbox/articles")))
I would recommend using records here to save you from "caddaar"
excesses. You can also use an enumerated type for the parts of the
filename (as I will do below) to avoid string comparisons and typos,
though symbols may be sufficient. In any case, I would not recommend
using strings.
See the S48 manual for documentation on these two topics.
Add -o define-record-types -o finite-types to the call to scsh, then
(define-enumerated-type filename-part :filename-part
  filename-part?
  filename-parts
  filename-part-name
  filename-part-index
  (ext base prefix all))
(define-record-type file-split-rule :file-split-rule
  (make-file-split-rule fname-part value dir)
  file-split-rule?
  (fname-part file-split-rule-fname-part)
  (value file-split-rule-value)
  (dir file-split-rule-dir))
Now you can rewrite the example as
(define file-split-rules
  (list (make-file-split-rule (filename-part ext)
                              "mp3"
                              "/home/saidani/sandbox/mp3")
        ;; etc. for the remaining rules
        ))
Samir> ;; usecase : (file-extension? "music.mp" "mp3") returns #f
Samir> ;; test ok
Samir> (define (file-extension? fname ext)
Samir> (if (regexp-search? (rx ,ext) (file-name-extension fname)) #t #f))
Small hint: The IF is not necessary here ;-)
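To illustrate the hint, here is a sketch of the same predicate without the IF. It relies on regexp-search? already returning a plain boolean, which is how the scsh manual documents it; nothing else is changed from the quoted definition.

```scheme
;; REGEXP-SEARCH? yields #t or #f itself, so its result can be
;; returned directly instead of being passed through IF.
(define (file-extension? fname ext)
  (regexp-search? (rx ,ext) (file-name-extension fname)))
```

The same simplification applies to file-basename? further down. Note that this is still a substring search, exactly as in the original.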
Samir> ;; usecase : (file-basename? "article-scheme.ps" "article") returns #t
Samir> ;; test ok
Samir> (define (file-basename? fname base)
Samir> (if (regexp-search? (rx ,base) (file-name-sans-extension fname)) #t #f))
Samir> ;; Return nil if not match, else return a list of possible destination
Samir> ;; Problem, it returns '( '() '() ...)... hmm. Dirty, no ?
Samir> (define (file-split-match fname)
Samir> (map
Samir> (lambda (rules)
Samir> (cond ((equal? (car rules) 'ext)
Samir> (if (file-extension? fname (cadr rules)) (caddr rules) '()))
Samir>
Samir> ((equal? (car rules) 'base)
Samir> (if (file-basename? fname (cadr rules)) (caddr rules) '()))
Samir> ))
Samir> file-split-rules))
This becomes
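(define (file-split-match fname)
  (map
   (lambda (rule) ;; each element is a single rule, so RULE, not RULES
     (cond ((eq? (file-split-rule-fname-part rule) (filename-part ext))
            (if (file-extension? fname (file-split-rule-value rule))
                (list (file-split-rule-dir rule))
                '()))
           ((eq? (file-split-rule-fname-part rule) (filename-part base))
            (if (file-basename? fname (file-split-rule-value rule))
                (list (file-split-rule-dir rule))
                '()))
           (else '()))) ;; prefix and all are not handled yet
   file-split-rules))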
You can wrap
(apply append ...)
around the call to MAP to get a list of directories.
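As a small sketch of why this works: each rule yields either a one-element list or the empty list, and APPEND drops the empty ones while concatenating the rest. The names per-rule-results and flatten-matches below are made up for the illustration and stand in for the result of the MAP call.

```scheme
;; Hypothetical stand-in for what MAP returns: one entry per rule,
;; either a one-element list holding a directory or '().
(define per-rule-results
  '(() ("/home/saidani/sandbox/mp3") () ("/home/saidani/sandbox/articles")))

;; APPLY hands every entry to APPEND as a separate argument;
;; the empty lists contribute nothing to the result.
(define (flatten-matches results)
  (apply append results))

(flatten-matches per-rule-results)
;; => ("/home/saidani/sandbox/mp3" "/home/saidani/sandbox/articles")
```

With the result flattened like this, the (delete '() ...) in the splitting loop should no longer be needed, and (car dest) in move-file still picks the first matching directory.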
Samir> ; Take the first matching destination
Samir> ; todo: if the dest doesn't exist, create it.
Samir> ; no move, for the moment, only debugging
Samir> ; I work on a sandbox directory
Samir> ; ls give : article article-azerty.ps coucou bonjour mp3/ mp3file mus.mp3 file-split.scm
Samir> (define (move-file src dest)
Samir> (if (not (null? dest)) (format #t "move ~a to ~a~%" src (car dest))))
Samir> ; Splitting process according split-rules
Samir> (with-cwd (car file-split-inbox)
Samir> (for-each
Samir> (lambda (fname)
Samir> (format #t "file :~a~%" fname) ; debug information
Samir> (move-file fname (delete '() (file-split-match fname))))
Samir> (directory-files)))
Samir> ------------------------
Samir> (define bookmark-split-inbox '("/home/saidani/sandbox/bookmarks.html"))
Samir> ; First, purify inbox : html->list
Samir> ; a list of anchor and title '( (anchor title) ...)
Samir> ; we want a new list and add directory
Samir> ; example '( ("http://www.test.fr" "Lisp User Group") ("http://www.retest.fr" "Skish"))
Samir> ; and output '( ("comp/lang/lisp"
Samir> ;; ("http://www.test.fr" "Lisp User Group") ("http://www.retest.fr" "Skish"))...)
Samir> ; Last, transform list->html
Samir> (define book (run/strings (cat bookmarks.html)))
Forking a process is way too expensive for such a small task; it is
better to do this in scsh itself:
(define (file->string-list filename)
  ;; call-with-input-file closes the port once the lines are read
  (call-with-input-file filename
    (lambda (port)
      (let lp ()
        (let ((line-or-eof (read-line port)))
          (if (eof-object? line-or-eof)
              '()
              (cons line-or-eof (lp))))))))
(I also added this to the Code Snippets Wiki)
(define book (file->string-list "bookmarks.html"))
Samir> (define test (nth book 10))
Samir> (define foo (match:substring (regexp-search (rx (: "file://" (* any) "html" )) test)))
Samir> (display foo)
Samir> (newline)
Later, you may consider using an HTML library for processing the bookmark file.
--
Martin