Lisp HUG Maillist Archive

Pathname vs Make-pathname?

Hi,

I was just sitting down to do some pathname matching on directory scans and noticed a persistent problem on the Mac with .DS_Store files peppered all over the place. And despite my filter, I could not see them removed from my lists.

I traced it down to the following: My filter is a list containing the result of (PATHNAME ".DS_Store"), while I was comparing against the incoming pathname with

(pathname-match-p
  (make-pathname  ;; compare sans directory
    :name (pathname-name path)
    :type (pathname-type path))
  (pathname ".DS_Store"))

The result is NIL (!!). Looking at the inspected components, (PATHNAME ".DS_Store") shows a SYSTEM::NAME component of "" (the empty string), while make-pathname produces a SYSTEM::NAME component of NIL. Hence the comparison failure.

Is this a bug?

- DM

_______________________________________________
Lisp Hug - the mailing list for LispWorks users
lisp-hug@lispworks.com
http://www.lispworks.com/support/lisp-hug.html


Re: Pathname vs Make-pathname?

I take this to be asking why

(pathname-name (make-pathname :name ""))

returns NIL. This happens in Lispworks on both MacOS and Windows, but not in Allegro or SBCL.
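
One way to check a given implementation is to print the name and type components produced by parsing and by MAKE-PATHNAME side by side. The following is only a minimal sketch: the function name DESCRIBE-DOT-FILE-COMPONENTS is hypothetical, and no output is shown because the results are implementation-dependent.

```lisp
;; Hypothetical helper (not from the thread): report how this implementation
;; represents a dot-file when the pathname is parsed from a string versus
;; built with MAKE-PATHNAME from an empty name and a type.
(defun describe-dot-file-components ()
  (let ((parsed (pathname ".DS_Store"))
        (built  (make-pathname :name "" :type "DS_Store")))
    (flet ((components (p)
             (list :name (pathname-name p) :type (pathname-type p))))
      (list :parsed (components parsed)
            :built  (components built)))))

;; Results differ between implementations, so none are listed here.
(print (describe-dot-file-components))
```

If the :PARSED and :BUILT entries differ, comparisons that mix the two kinds of pathname are likely to fail on that implementation.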



On Sep 8, 2016, at 1:59 PM, David McClain <dbm@refined-audiometrics.com> wrote:


Hi,

I was just sitting down to do some pathname matching on directory scans and noticed a persistent problem on the Mac with .DS_Store files peppered all over the place. And despite my filter, I could not see them removed from my lists.

I traced it down to the following: My filter is a list containing the result of (PATHNAME ".DS_Store"), while I was comparing against the incoming pathname with

(pathname-match-p
  (make-pathname  ;; compare sans directory
    :name (pathname-name path)
    :type (pathname-type path))
  (pathname ".DS_Store"))

The result is NIL (!!). Looking at the inspected components, (PATHNAME ".DS_Store") shows a SYSTEM::NAME component of "" (the empty string), while make-pathname produces a SYSTEM::NAME component of NIL. Hence the comparison failure.

Is this a bug?

- DM



------------------
Christopher Riesbeck
Home page: http://www.cs.northwestern.edu/~riesbeck
Calendar: http://www.cs.northwestern.edu/~riesbeck/calendar.html


Re: Pathname vs Make-pathname?


On Sep 9, 2016, at 12:17 AM, Madhu <enometh@meer.net> wrote:


> * Chris Riesbeck Wrote on Thu, 8 Sep 2016 19:33:44 +0000:
>
> | I take this to be asking why
> |
> | (pathname-name (make-pathname :name ""))
> |
> | returns NIL. This happens in Lispworks on both MacOS and Windows, but
> | not in Allegro or SBCL.
>
> But notice that (pathname-name (pathname "")) is also NIL

I don’t see the relevance of this example. Presumably (PATHNAME "") calls (MAKE-PATHNAME) or the internal equivalent. NILs everywhere are to be expected.

The issue here is what (MAKE-PATHNAME :NAME "") does.

> No, I do not think David is asking this.  Phrasing the question this
> way may confuse the issue, which is about PARSE-NAMESTRING.
>
> The question is about (pathname ".DS_Store")
>
> ".DS_Store" is ambiguous. How is it to be parsed?

It’s certainly implementation-defined, but behavior should be consistent within one implementation.

[snipping]

> But the question remains why does LW's MAKE-PATHNAME use NIL to
> represent an empty name component instead of an empty string (or
> alternatively why does PARSE-NAMESTRING return an empty string instead
> of NIL)
>
> (pathname-name (make-pathname :name "" :type "DS_Store")) ;=> NIL
> (pathname-name (pathname ".DS_Store")) ;=> ""

I agree this is the question, but it’s not just PATHNAME and PARSE-NAMESTRING that do this. Most critical for David’s case is that DIRECTORY returns file objects with "" names. Only MAKE-PATHNAME seems unable to create empty-string file names.

> My guess is NIL is a better choice when merging.

Better why? It’s different. Where would this be better than in Allegro and SBCL, where (MAKE-PATHNAME :NAME "") puts "" in the name field?

> It is also
> potentially a wildcard for PATHNAME-MATCH-P with benefits precisely in
> use cases such as David's case here!
>
> (pathname-match-p "/dev/shm/.DS_Store"  ".DS_Store") ; => T

I’m confused. How does this show the benefit of MAKE-PATHNAME’s behavior? There’s probably no MAKE-PATHNAME call here, just implicit PATHNAME calls on both arguments. We already know PATHNAME will generate "" names.

> The second argument to PATHNAME-MATCH-P is always a constant.  You
> should be using MAKE-PATHNAME here instead of the first argument.
>
> (pathname-match-p path ".DS_Store") already works

Yes, but precisely because it avoids MAKE-PATHNAME.

> or
>
> (pathname-match-p path (make-pathname :name nil :type "DS_Store"))
>
> would work in LispWorks

This is not portable. It would not work in Allegro or SBCL, where the correct wildcard is (MAKE-PATHNAME :name ".DS_Store").

> Comparing PATHNAMES is usually problematic because it depends on who is
> creating the pathnames.  I remember in 2013 I had a problem with
> DIRECTORY filling in some pathname components as :UNSPECIFIC. Martin
> pointed out to me that the most reliable way to compare pathnames is
> with namestrings; there are incompatibilities in how other lisps quote
> characters in creating lisp namestring strings for files on the same
> operating system
>
> -- Madhu

We certainly agree here.

But I still would like to see Lispworks explain the rationale for why MAKE-PATHNAME behaves the way it does, in the context of how they chose to make pathname parsing work with dot-files.
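
The namestring comparison mentioned above can be sketched as follows. This is only an illustration: DS-STORE-P and REMOVE-DS-STORE are hypothetical names, and the sketch assumes that the implementation's FILE-NAMESTRING prints a dot-file back as ".DS_Store" (with no version suffix), however it split the name and type internally.

```lisp
;; Hypothetical predicate: true when the file part of PATH is ".DS_Store".
;; FILE-NAMESTRING returns the name, type, and version part of a pathname
;; as a string, so the test does not rely on how name and type were split.
(defun ds-store-p (path)
  (equal (file-namestring path) ".DS_Store"))

;; Hypothetical filter for a list of pathnames, e.g. a DIRECTORY result.
(defun remove-ds-store (paths)
  (remove-if #'ds-store-p paths))
```

Because no MAKE-PATHNAME call is involved, the empty-string versus NIL difference never enters the comparison.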


------------------
Christopher Riesbeck
Home page: http://www.cs.northwestern.edu/~riesbeck
Calendar: http://www.cs.northwestern.edu/~riesbeck/calendar.html


Re: Pathname vs Make-pathname?

Christopher K Riesbeck <c-riesbeck@northwestern.edu> writes:

> I take this to be asking why
>
> (pathname-name (make-pathname :name ""))
>
> returns NIL. This happens in Lispworks on both MacOS and Windows, but
> not in Allegro or SBCL.

Cf. 19.2.2.5:

    * The host, device, directory, name, and type can be strings. There
      are implementation-dependent limits on the number and type of
      characters in these strings.

Therefore you should be very conservative on the length of the strings
you pass to make-pathname.  An empty string is probably not a valid
pathname component on at least some implementations.  A string longer
than 5 or 7 for the name component is probably out of bounds on some
implementations too.

For the type you will have even stronger limitations.


Furthermore:

    19.3.2.2 Null Strings as Components of a Logical Pathname

    The null string, "", is not a valid value for any component of a
    logical pathname.



So since I would advise to always use logical pathnames, an empty string
is definitely something you should not use.


-- 
__Pascal Bourguignon__                 http://www.informatimago.com/
“The factory of the future will have only two employees, a man and a
dog. The man will be there to feed the dog. The dog will be there to
keep the man from touching the equipment.” -- Carl Bass CEO Autodesk



Re: Pathname vs Make-pathname?

On 9/10/16 Sep 10 -2:31 PM, Pascal J. Bourguignon wrote:
> 
> Christopher K Riesbeck <c-riesbeck@northwestern.edu> writes:
> 
>> I take this to be asking why
>>
>> (pathname-name (make-pathname :name ""))
>>
>> returns NIL. This happens in Lispworks on both MacOS and Windows, but
>> not in Allegro or SBCL.
> 
> Cf. 19.2.2.5:
> 
>     * The host, device, directory, name, and type can be strings. There
>       are implementation-dependent limits on the number and type of
>       characters in these strings.
> 
> Therefore you should be very conservative on the length of the strings
> you pass to make-pathname.  An empty string is probably not a valid
> pathname component on at least some implementations.  A string longer
> than 5 or 7 for the name component is probably out of bounds on some
> implementations too.
> 
> For the type you will have even stronger limitations.
> 
> 
> Furthermore:
> 
>     19.3.2.2 Null Strings as Components of a Logical Pathname
> 
>     The null string, "", is not a valid value for any component of a
>     logical pathname.
> 
> 
> 
> So since I would advise to always use logical pathnames, an empty string
> is definitely something you should not use.
> 
> 

I'm surprised to hear this advice.  Our experience in ASDF has been
precisely the opposite: because of their limitations, we *avoid* logical
pathnames (although ASDF does support them).  Their behavior with
commonly used pathname constituents (mixed-case, underscores) is either
failure, or inconsistent behavior across implementations.  E.g., if you
provide a system that will load files relative to a logical pathname in
your configuration, and a user enters a pathname containing an
underscore.... bad things will happen.

Logical pathnames were a reasonable approach for a time when filesystems
(and operating systems) were much more diverse than they are today.
Today they are simply too constraining.

One of the challenges Faré addressed in UIOP was to make file system
interaction as consistent as possible across platforms (and even so,
results are still inconsistent).

Best,
R



Re: Pathname vs Make-pathname?


On Sep 10, 2016, at 2:31 PM, Pascal J. Bourguignon <pjb@informatimago.com> wrote:


Christopher K Riesbeck <c-riesbeck@northwestern.edu> writes:

I take this to be asking why

(pathname-name (make-pathname :name ""))

returns NIL. This happens in Lispworks on both MacOS and Windows, but
not in Allegro or SBCL.

Cf. 19.2.2.5:

   * The host, device, directory, name, and type can be strings. There
     are implementation-dependent limits on the number and type of
     characters in these strings.

Therefore you should be very conservative on the length of the strings
you pass to make-pathname.  An empty string is probably not a valid
pathname component on at least some implementations.  A string longer
than 5 or 7 for the name component is probably out of bounds on some
implementations too.

Sure. But David wasn’t generating the empty string names to begin with. Lispworks was, with DIRECTORY, and also it turns out PATHNAME and PARSE-NAMESTRING.  

No problem would have arisen if Lispworks’ DIRECTORY had returned an object for ".DS_Store" with NIL for the name in the first place.

It’s this apparently inconsistent behavior that caused the problem. 

------------------
Christopher Riesbeck
Home page: http://www.cs.northwestern.edu/~riesbeck
Calendar: http://www.cs.northwestern.edu/~riesbeck/calendar.html


Re: Pathname vs Make-pathname?

David McClain <dbm@refined-audiometrics.com> writes:

> Hi Pascal,
>
> Logical pathnames are a huge inconvenience, at best, and perhaps
> impossible in general. I’m scanning arbitrary directories across
> networks of LAN machines.
>
> But it turns out that just using plain strings in pathname-match-p
> works just fine after all. It was my mistake, early on, thinking that
> I needed a pathname as the second argument to pathname-match-p.


Indeed.


The set of POSIX paths representable as Common Lisp STRING is a strict
subset of the set of all POSIX paths.

The set of POSIX paths representable as physical pathnames is a subset of
the set of POSIX paths representable as Common Lisp STRING

The set of POSIX paths representable as logical pathname is a subset of
the set of POSIX paths representable as physical pathnames.


So if you want to work with POSIX paths, ALL of them, you cannot even
use Common Lisp strings!!!

You will have to use vectors of octets (excluding 0 and 47).
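
A minimal sketch of that octet-vector representation follows. The names OCTET, VALID-POSIX-COMPONENT-P and SPLIT-POSIX-PATH are hypothetical, and requiring a component to be non-empty is an assumption of this sketch.

```lisp
;; Hypothetical type name for a single byte of a path.
(deftype octet () '(unsigned-byte 8))

(defun valid-posix-component-p (octets)
  "True when OCTETS is a non-empty vector of octets with no 0 (NUL) and no 47 (slash)."
  (and (vectorp octets)
       (plusp (length octets))            ; assumption: empty components are rejected
       (every (lambda (o)
                (and (typep o 'octet) (/= o 0) (/= o 47)))
              octets)))

(defun split-posix-path (octets)
  "Split the octet vector OCTETS on 47 (slash), dropping empty components."
  (let ((components '())
        (start 0))
    (loop for end = (position 47 octets :start start)
          do (let ((piece (subseq octets start end)))
               (when (plusp (length piece))
                 (push piece components)))
             (if end
                 (setf start (1+ end))
                 (return)))
    (nreverse components)))
```

No character encoding is involved at any point, which is what keeps this representation free of the encoding problems described here.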


If you try to use Common Lisp STRING to represent POSIX paths, then you
will have a lot of problems with character encodings. 

If you're happy with those encoding problems, then go ahead, use Common
Lisp STRING, this will cover a lot of paths, notably if you avoid server
sharing files with heterogeneous systems, if you avoid mounting file
systems coming from heterogeneous systems, and if you avoid having
multiple users using different languages and different terminals on your
POSIX system.

(Granted, nowadays, a lot of computer systems match those conditions, so
it's a little less probable to find such encoding errors than earlier).


Using logical pathnames or physical pathnames (eg. as returned by
DIRECTORY), implies indeed that you will be able to access only a subset
of the paths in a POSIX system.  Here lies the sanity when using
PATHNAMEs!


Otherwise, use vectors of octets, but I'm afraid few CL POSIX libraries
provide an API using this type for "char*" data…


-- 
__Pascal Bourguignon__                 http://www.informatimago.com/
“The factory of the future will have only two employees, a man and a
dog. The man will be there to feed the dog. The dog will be there to
keep the man from touching the equipment.” -- Carl Bass CEO Autodesk



Re: Pathname vs Make-pathname?

Robert Goldman <rpgoldman@sift.net> writes:

> On 9/10/16 Sep 10 -2:31 PM, Pascal J. Bourguignon wrote:
>> 
>> Christopher K Riesbeck <c-riesbeck@northwestern.edu> writes:
>> 
>>> I take this to be asking why
>>>
>>> (pathname-name (make-pathname :name ""))
>>>
>>> returns NIL. This happens in Lispworks on both MacOS and Windows, but
>>> not in Allegro or SBCL.
>> 
>> Cf. 19.2.2.5:
>> 
>>     * The host, device, directory, name, and type can be strings. There
>>       are implementation-dependent limits on the number and type of
>>       characters in these strings.
>> 
>> Therefore you should be very conservative on the length of the strings
>> you pass to make-pathname.  An empty string is probably not a valid
>> pathname component on at least some implementations.  A string longer
>> than 5 or 7 for the name component is probably out of bounds on some
>> implementations too.
>> 
>> For the type you will have even stronger limitations.
>> 
>> 
>> Furthermore:
>> 
>>     19.3.2.2 Null Strings as Components of a Logical Pathname
>> 
>>     The null string, "", is not a valid value for any component of a
>>     logical pathname.
>> 
>> 
>> 
>> So since I would advise to always use logical pathnames, an empty string
>> is definitely something you should not use.
>> 
>> 
>
> I'm surprised to hear this advice.  Our experience in ASDF has been
> precisely the opposite: because of their limitations, we *avoid* logical
> pathnames (although ASDF does support them).  Their behavior with
> commonly used pathname constituents (mixed-case, underscores) is either
> failure, or inconsistent behavior across implementations.


Logical pathname components cannot have mixed-case or underscores, etc.
The only valid (conforming) characters in logical pathname components
are A-Z 0-9, #\. and #\-.
cf. 19.3.1: word---one or more uppercase letters, digits, and hyphens.
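
The 19.3.1 definition of a word can be checked with a few lines of code. This is only a sketch: LOGICAL-PATHNAME-WORD-P is a hypothetical name, and the sketch treats the dot as a separator between name, type and version rather than as a word character, which is an assumption made here.

```lisp
;; Hypothetical predicate following the 19.3.1 wording:
;; a word is one or more uppercase letters, digits, and hyphens.
(defun logical-pathname-word-p (string)
  (and (stringp string)
       (plusp (length string))   ; the null string is never a valid word
       (every (lambda (ch)
                (find ch "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-"))
              string)))
```

A component such as "DS_Store" fails this check twice over, because of the underscore and the lowercase letters, and the empty string fails it as well.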

And indeed, since physical pathnames are purely implementation
dependent, the "automatic" mapping between logical pathnames and
physical pathnames is purely implementation dependent.

The only conforming thing you can do here, is starting from a list of
physical pathnames, to build logical pathnames, establishing an explicit
mapping thru logical pathname translations (therefore with absolutely no
wild card).

When you do that, everything is bliss! :-)


> E.g., if you
> provide a system that will load files relative to a logical pathname in
> your configuration, and a user enters a pathname containing an
> underscore.... bad things will happen.

Of course.  Why would you accept an underscore?  Only uppercase letters,
digits, dot and dash.


> Logical pathnames were a reasonable approach for a time when filesystems
> (and operating systems) were much more diverse than they are today.
> Today they are simply too constraining.

While logical pathnames might be overkill to deal with POSIX paths,
neither physical pathnames nor CL STRING are adapted to them.  The
semantics of POSIX paths, and file system mounting is such that the only
safe way to deal with POSIX paths, is to deal with them as what they
are: vectors of octets (excluding 0, 47 being the component separator).


> One of the challenges Faré addressed in UIOP was to make file system
> interaction as consistent as possible across platforms (and even so,
> results are still inconsistent).

And then, MS-Windows and MS-DOS paths are not POSIX paths since instead
of using 47, they use 92, and there's this device prefix (that while it
could be mapped to CL pathname-device, is often not; in any case it's
implementation dependent).

Notice also that MacOSX also deals with MacOS paths (components
separated with ":", but I'm not sure if the unix layer does, or if it's
only in the Cocoa layer and the user interface).

In any case, there's still space for logical pathnames in the
contemporaneous environment.


(For example, iolib/syscall which should know better, converts vectors
of octets into lisp strings using a utf-8 encoding (escaping with a
#\Nul the octets that are not valid utf-8 sequences).  What about file
systems using iso-8859-1, windows-1250, or cp830?
You may get botched components.  )

-- 
__Pascal Bourguignon__                 http://www.informatimago.com/
“The factory of the future will have only two employees, a man and a
dog. The man will be there to feed the dog. The dog will be there to
keep the man from touching the equipment.” -- Carl Bass CEO Autodesk



Updated at: 2020-12-10 08:31 UTC