diff options
author | Alexander Larsson <alexl@src.gnome.org> | 2007-09-13 07:47:11 +0000 |
---|---|---|
committer | Alexander Larsson <alexl@src.gnome.org> | 2007-09-13 07:47:11 +0000 |
commit | 4ffa0e576a67189362214d5b57c5e1356de41e8b (patch) | |
tree | c949c7af7e562cdff3b8b7b532357b51d90e2424 /txt | |
parent | 9415407715053833af296e03440e3760fd3e75ac (diff) | |
download | gvfs-4ffa0e576a67189362214d5b57c5e1356de41e8b.tar.gz |
Initial revision
Original git commit by alex <alex> at 1159272204 +0000
svn path=/trunk/; revision=3
Diffstat (limited to 'txt')
-rw-r--r-- | txt/ops.txt | 137 | ||||
-rw-r--r-- | txt/vfs-ideas.txt | 425 | ||||
-rw-r--r-- | txt/vfs-names.txt | 142 |
3 files changed, 704 insertions, 0 deletions
diff --git a/txt/ops.txt b/txt/ops.txt new file mode 100644 index 00000000..bf20647c --- /dev/null +++ b/txt/ops.txt @@ -0,0 +1,137 @@ + + +type: File, Folder, Symlink, Shortcut, Mountable, special (fifo, socket, chardec, blockdev) +flags: hidden, +GFileInfo { + type get_type() + char *get_name() + char *get_display_name() + char *get_icon() /* string? what about win32, remote icons etc */ + + gint64 get_file_size() + char *get_mime_type() + char *get_link_target() + can_read()/write()/delete()/rename()/maybe: move()/copy() + flags get_flags() + time_t get_modification_time() + gboolean get_unix_stat () + char *get_attribute() + char **get_attributes(char *namespace) + char **get_all_attributes() /* form namespace:attrname -> string */ +} + +GFSInfo { + char *get_fs_type() + gint64 get_free_space() + gint64 get_total_space() +} + +GFile *g_file_for_path (char *path) +GFile *g_file_for_uri (char *uri) +GFile *g_file_parse_display_name (char *display_name) + +GFile { + char *get_path() + is_native() => is_file:/// + + char *get_uri (); + char *get_absolute_display_name () + + set_keep_open(boolean keep_open) + + GFile *get_parent () + GFile *get_child (char *name) + GFileEnumerator *enumerate_children(flags, attributes...) + GFileInfo *get_info (flags, attributes...) + + void reload() + GInputStream *read() + + GOutputStream *append_to() /* optional (not on webdav) */ + GOutputStream *create() + GSaveStream *replace(mtime, backup_name, ) +/* permissions are all set minus umask, except replace which + saves old permissions */ + +/* ?? */ + GFile *resolve_symlink(char *symlink_target); + +/* output ops */ + write/save + rename + move + copy + delete + mkdir + rmkdir + display name -> filename (for new files) + set attrs + + /* other ops: */ + monitor(flags) + signals + mount/unmount + list volumes + + Maybe: + GFile *new_from_uri(path, flags) (file:/// uris) + +} + + +names: + URIs == raw filename (no encoding), all escaped + We generate display absolute paths as filenames if possible, otherwise + as IRIs. This means we can display nice URIs for native utf8 backends + and filenames. However, URIs for non-utf8 shares will look bad. If we know + the encoding we can still get nice non-absolut display names though. + + In client we store names as mountpoint + non-escaped no-encoding string. + Non-uri display name handling done in daemon + +GStatable iface for fstat() support +GSaveStream, with get_final_file_info() + +open for writing: + +append vs truncate +fail on existing or replace + +mtime match +mtime return +backup (suffix+prefix) +create filename from display name +unique name +keep inode or be atomic? + +filename_for_display_name() +write_append() /* optional (not on webdav) */ +write_new() +write_replace() + + + + +ftp supports: + overwrite + append + generate unique name + +http+webdav supports: + overwrite + append in recent versions + get mtime, length, mimetype, atime on read open + + + + + +async thread work: + +function to run in thread +data to pass to thread +cancel identifier +pass in cancel func + data +way for function to communicate with mainloop (of specific context) +does mainloop notifiers block on ack? + + diff --git a/txt/vfs-ideas.txt b/txt/vfs-ideas.txt new file mode 100644 index 00000000..7b41a064 --- /dev/null +++ b/txt/vfs-ideas.txt @@ -0,0 +1,425 @@ +Subject: Plans for gnome-vfs replacement + +Recently there has been a lot of discussions about the platform and +the correct stacking order and quality of the modules. Gnome-vfs +is a clear problem in this discussion. Having spent the last 4 years +as the gnome-vfs maintainer, and even longer as the primary gnome-vfs +user (in Nautilus) I'm well aware of the problems it has. I think that +we've reached a point where the problems in the gnome-vfs architecture +and its position in the stack are now ranking as one of the most +problematic aspects of the gnome platform, especially considering the +enhancements and quality improvements seen in other parts of the +platform. + +So, I think the time has come for a serious look at what gnome-vfs +could be. I've spent much time last week thinking about the weaknesses +and problems of the current gnome-vfs and possibilities inherent in a +redesign, both having learnt from 7 years of gnome-vfs existance and +the improvements in the platform (both Gnome and surrounding +technologies) since 1999 when it was designed. + +As soon as you spend some time looking at this problem is evident that +to solve the platform ordering issues we really need a clean cut from +the current gnome-vfs. I think the ideal level for a VFS would be in +glib, in a separate library similar to gthread or gobject. That way +gtk+ would be able to integrate with it and all gnome apps would have +access to it, but it wouldn't affect small apps that just wants to use +the glib core. Furthermore, not being libglib lets us use GObjects in +the vfs, which means we can make a more modern API. Of course, this +places quite some limitations on the vfs, especially in terms of +dependencies and how integration with a UI should work. + +Any thoughs on the design of a vfs to replace gnome-vfs must be based +on a solid understanding of the problems with the current system, to +avoid redoing old mistakes and to make sure that we solve all the +known problems of the current system. So, I'm gonna start by +describing what I see as the main architectural and "hard" problems in +gnome-vfs. + +The first, and most often discussed problem is of course +compatibility. The various desktops use different vfs implementations, +so they can't read each others files. Many applications use no VFS +at all and have no way to access files on vfs shares. And the +existence of multiple vfs implementations makes it unlikely that they +will start using one. + +Gnome-vfs has no concept of Display Names, something which is very +useful in a gui based system. In some places we auto-generate virtual +desktop files to get this feature, but that is very hackish and need +support in all apps to understand them. This is also quite closely +related to handling filename charset encoding, another very weak point +in gnome-vfs. The ideal way to reference a file is the actual real +identifer on disk, database, remote share or what have you, as that +can be passed between implementations, used with other access methods, +etc. But to display something useful to the user you really need a +user understandable utf-8 encoded string and a way to map that to/from +the filename. + +There is no support for icons at the level of gnome-vfs. This means +that all users above the vfs must implement it themselves (e.g. in +nautilus and in the file selector). Each implementation have its own +bugs, maintainance load and risk for different behaviour. It also +means that vfs backends cannot supply their own icons, something which +might be very useful for e.g. a network share based on some new fancy +web service. + +The abstraction that gnome-vfs use is very similar to the posix model, +which is problematic in several ways. The posix model matches poorly +with the sort of highlevel operations that gnome applications wants to +do with the vfs. The vfs is not typically used for what I would call +"implementation files", which are things like configuration files, +data files shipped with apps, system files, etc, but rather for what +I'd like to call "user document files". These are the kind of files +you open, save, or download from the internet to look at. Applications +that use these would like highlevel operations that match the kind +of operations you use on them, like read-entire-file, save-file, +copy-file, etc. + +The posix model is also a bad match when implementing gnome-vfs +modules. It requires some features from the implementation that can be +difficult or impossible to implement. For example, its not really +possible to support seeking when writing to a file on a webdav share, +because a webdav put operation is essentially just streaming a copy of +the new file contents. We currently work around this by locally +caching all the file data being written and then sending it all in the +close() call. This is clearly suboptimal for a lot of reasons, like +applications not expecting close() to take a long time, and not +checking its error conditions very closely. + +Another problem is that the posix model doesn't contain explicit +operations for some of the things applications need to do. So instead +applications rely on well known knowledge of the behaviour of posix +to implement these operations. However, such behaviour might not be +guaranteed on some gnome-vfs backends. A common example is the atomic +save operation. A typical way to implement this on posix is to write +to a temporary file in the same directory, and then rename over the +target file, thus guaranteeing an atomic replacement of the file on +disk, or a failure that didn't affect the original file. Of course, +if a gnome-vfs application would use this and the backend was a +webdav share to a subversion repository you would get some really +weird versioning history in the repository for no good reason. If the +backend had its own implementation of the save operation we could get +both optimal behaviour on each backend, and an application API that +doesn't require arcane knowledge of the atomicity of renames. + +One of the most problematic aspects of gnome-vfs is its authentication +framework. The way it works is that you register callbacks to handle +the authentication dialog, and whenever any operation needs to do +authentication these callbacks will be called. The idea is that a +console application would register a set of callbacks that print +prompts on the console, and a Gtk+ application would have a set of +callbacks that displays dialogs. There is a set of standard dialog +based callbacks in libgnomeui that you can install by calling +gnome_authentication_manager_init(). From an initial look this seems +like a reasonable approach, but it turns out that this creates a host +of different problems. + +One problem is how you connect this to the application. A lot of +people are unaware that you have to call gnome_auth_manager_init() to +get authentication dialogs, or don't want to depend on libgnomeui to +do so. So a lot of applications don't work with authentication. Those +who do call it generally have pretty poor integration with the +authentication dialogs. For instance, the general authentication +dialogs can't be marked as parents of whatever dialog caused the +authentication (because they have know whay of knowing what caused +it), and all sorts of problems appear when there is a modal dialog +displayed already. + +Another problem is the combination of blocking gnome-vfs calls and +authentication. When calling a blocking operation like read() and it +results in a password dialog we have to start up a recursive mainloop +to display it. Not only is this unexpected for the application, it +also brings with it all the type of reentrancy issues that we had in +bonobo. Even worse, there is no way to make this threadsafe. To make +it threadsafe the callback would have to take the gdk lock before +doing any Gtk+ calls, but this would cause a deadlock if the +application called it with the gdk lock held. If we don't take the +gdk lock then you can't do blocking vfs calls on any thread but the +mainloop, or you have to take the gdk lock on any gnome-vfs call. + +The authentication callbacks can appear at *any* gnome-vfs entry +point, which makes it very hard to write gnome-vfs applications that +don't accidentally trigger a lot of authentication dialogs. For +instance, the tree sidebar in nautilus has to take particular care not +to stat or otherwise look at the toplevel items until the user +explicitly expands them, otherwise you'd get authentication dialogs +every time you opened a window. Its also easy to get multiple +authentication dialogs for the same entity. + +The way threads are used in gnome-vfs is problematic, both from the +point of view of writing backends, and for users of the library. For +users it forces the use of threading, even if the application doesn't +use the asynchronous calls that use the threading. It also enforces +the need for a gnome_vfs_init() function, as thread initialization +must be done very early. + +For backend implementations the use of threads forces every +backend to be threadsafe. Many of the backends are inherently single +threaded, either because they use non-threadsafe libraries like the +smb backend, or because the server being wrapped forces serialized +access (like an ftp backend where you really only want one connection +to the server). + +Backends run in context of the application using gnome-vfs, which can +be a gtk+ app, but as well a console application, so they have no +control or guarantee of their environment. For instance, they cannot +rely on the existance of a mainloop, so there is no way to use +e.g. timeouts to handle invalidation of caches. One way we have tried +to solve this is to move some backends to the gnome-vfs daemon, where +they can rely on the existance of the mainloop. + +Gnome-vfs use something called "gnome vfs uris" to identify +files. These are similar, but not entierly identical to the types of +uri used in webbrowsers. For instance, we often make us our own types +of URIs when there is no official standard for them (although such +standards might appear later, with incompatible behaviour). We also +have a "well defined" posix-like type of behaviour that isn't the same +as for web uris. The most extreme example would be mailto:, but even +things like ftp:// uris are different. The ftp uri rfc explains how +ftp:///dir/file.txt refers to $(login_dir)/dir/file.txt, and that you +have to use ftp:///%2fdir/file.txt to refer to the absolute path +/dir/file.txt on the server. Clearly we can't have pathname handling +semantics that vary depending on the backend (no app would get it +right), so we ignore the rfcs on this. + +Then there is the thing with escaping and unescaping uris. Although +technically not very complex it is just are very hard to get right all +the time. Among the most common questions on the gnome-vfs list is +what the various escape/unescape functions does, what arguments has to +be escaped, and how to display uris "nicely" (i.e. without escapes, +although that makes them invalid uris). This is made extra complicated +due to the poor handling of filename encodings and display names, and +the fact that only "less common" cases (like spaces in filenames) +break if you get it wrong. + +Last but not least, the fact that gnome-vfs uses something called a +"uri" gives people the wrong impression of what the library is +designed for. It causes people to complain when it doesn't have some +support for mailto: links, and it makes people want support for +cookies, extra http headers and other things typically used by a web +browser. This isn't really the kind of use that vfs is targeted at. A +library specific to that sort use would probably fit these apps much +better. + +Most gnome-vfs state is tied to the application that uses it, which I +think is quite unexpected by the user. For instance, when you log into +a network share in nautilus and then click on a file to open it, the +opening application will have to re-connect and re-authenticate to +the share, much to the users surprise. I really think most people +expect a login like that is somehow session global. We do sometimes +misuse gnome-keyring to "solve" the authentication issue, but even +then we still have multiple connections to the network share, which +can cause problem, for instance with ftp shares that use round-robin +dns where the mirrors aren't fully sync:ed up. Again, some backends +(smb) are now in the daemon which solves this issue. + +gnome_vfs_xfer() is possibly the worst-API call in the whole gnome +platform. Its a single, buggy, do-it-all function with shitloads of +combinations of flags and arguments to do all sort of things, with +little or no semantic specifications or testcases. Its also to a large +extent unnecessary for most applications and could easily be part of +the file manager instead of a generic library. I'm also not sure that +the "first do preflight calculation, then execute operation" model it +uses is right. It is inherently racy, since the target or source could +easily change during the preflight, and it makes error reporting and +handling much more complicated. + +The behaviour of symlink resolution in the UI has been discussed many +times. Should clicking on a symlink "foo" in $dir go to $dir/foo or to +the target directory. The Nautilus maintainers has decided that the +best way to approach this is to have symlinks be used for "filesystem +implementation" (like a symlink for /home -> /mnt/hdb2) and thus not +be resolved on activation. However, we should (this hasn't been +finished yet) support a different form of links (called "shortcuts" in +the UI) that always resolve on activation. At the moment there is no +support for anything like that in gnome-vfs, so we abuse desktop files +for this. We even generate virtual in-memory desktop files in the smb +backend to get this behaviour. Proper support for shortcuts in the +vfs API would let apps automatically work without ugly desktop file +hacks. + +Over the years gnome-vfs has accumulated a lot of cruft. It links to a +lot of libraries, including openssl, gconf+ORBit2, avahi, dbus, popt, +libxml, kerberos, libz and libresolv. Very few applications need all +of these, yet every application that uses gnome-vfs links to all of +them. Furthermore, some of the functionallity in gnome-vfs, like the +wrapper for dns-sd, resolving, network utilities, ls parsing +functions, ssl support, pty handling are perhaps not best suited for a +vfs library, nor do they always have great apis and quality +implementations. We could definately clean this up and minimize the +APIs. + +At some point in time gnome_vfs_uri_is_local() started detecting and +returning TRUE for NFS mounts and other type of local network +mounts. This is both slow and unexpected, and has led to problems and +unnecessary changes in many places. + +The way the cancellation API for asynchronous operations is set up +creates races and fragile code. The main issue is that if you call +cancel before the operation callback has been called the callback will +not be called. However, the callback typically wants to free some sort +of user_data object passed to it, so that has to be handled also when +you call cancellation. Couple this with the fact that there is no +destroy notifies and you can't cancel after the operation callback has +been called and you get an extremely tricky setup of combined +callbacks. Furthermore, if threads are used there are some inherent +races wrt detecting if the callback has been called when cancel is +called, making it essentially impossible to get this right. + +There are also a bunch of issues with the current gnome-vfs that could +technically be fixed like support for hidden file flags, +backend-extensible metadata, no standard vfs dialogs like progress +bars, etc. + +Last week I started thinking about a new design for a gnome-vfs +replacement that would solve most of these issues, and at the same +time gives a correct ordering of the platform stack. I've come up with +a highlevel architecture that I think will work, even though I haven't +yet finished it in detail or gotten the API totally worked out. Its +somewhat of a radical departure from gnome-vfs as it is today, so +brace with me as I try to explain the model and the ideas behind it. + +The gnome-vfs model is what I would call stateless. You can at any +time throw a URI at it and it will do everything required to access +the location. There is no need to, nor is there a way to set up +anything like a "session" with a remote share. Of course, in practice +this is not the way network shares work, so all sorts of session +initiation, caching and other magic happens under the covers to make +it look stateless. This is the source of all the problems with the +gnome-vfs authentication model. + +I'd like to propose using a stateful model, where you have to +explicitly initiate a session ("mount" a share) before you can start +accessing files. This will give a well specified time when all forms +of authentication will happen, when applications expect it and when +they can use a more expressive and suitable API for this kind of +operation. The actual i/o operations will then never cause any sort of +authentication issues, and can thus be purely non-graphical +(i.e. glib-only apps can do i/o). I imagine all/most actual mounting +of shares will happen in the file manager and the file selector, or at +gnome-session startup, so applications don't really need to handle +this themselves. + +Not only is the model stateful. I'd like all state to be session +global. That is, all mounts and network connections are shared between +all applications in the session. So, if you pass a file reference from +one app to another there is no need to log in again or anything like +that. I think this is what users expect. + +Having a global stateful model means all non-local vfs accesses go +through the vfs daemon. This works pretty well with the smb backend in +the current gnome-vfs, and smb is the backend most likely to have high +bandwidth traffic, so this doesn't seem to be a large performance +problem. Although we do have to take the performance aspect into +consideration when designing the daemon. + +In order to avoid all the problems with threading described above the +vfs daemon will not use threads. In fact, I think the best approach is +to let each active mountpoint be its own process. That way we get +robustness (one mount can't crash the others) and simplify the backend +creation greatly (each backend fully controls its context). It also +will let us do concurrent access to e.g. two smb shares (like a copy +from one to the other). We can't really do this atm since the thread +lock in the smb backend serializes such access. But with two smb +processes this is not a problem. + +There might be an issue with using separate processes for the +mountpoints bloating up the desktop, but I don't think that it will be +much of a problem. None of these processes will use the gui libraries +that are the real sources of unshared dirty memory use. I tried a +simple process that just used gobject and ran a mainloop. It only used +78k of dirty memory. Also, each server need only link to and +initialize the few libraries it needs, further keeping memory use down +and avoiding bloat in all applications (e.g. apps need not link to +openssl). + +As a consequence of the stateful model we don't need the stateless +properties that URIs has as identifier. To avoid all the problems +comming from the use of URIs we use a much simpler form of +identifier. Namely filenames, in a hierarchical tree with +mountpoints. These filenames are from an extended set of strings that +includes the set of normal filenames, but also includes some platform +dependent extensions. On win32 the full set might be some form of +stringified version of the ITEMIDLIST from the windows shell api, and +on unix we would use some out of band prefix to mark a non-local +filename. + +For example, we could be to use "//.network/" as a prefix for the vfs +filename namespace. A smb share might then be accessed as +"//.network/smb/computer:share/dir/file.txt", or a ftp share as +"//.network/ftp/alex@ftp.gnome.org/dir/file.txt". With a setup like +"//.network/$method/$mount_object/" it would be quite easy to find the +process handling the mount. Just ask for a dbus named object like +"org.glib.gvsf.smb.computer:share". It is also very easy to detect +local filenames and short-circuit to in-process i/o. + +These filenames would be the real identifier for the files, and as +such not really presentable to the user as it. You'd need to ask for +the display name via the vfs to get a user readable utf8-encoded +string for display. + +The set of operations on files and folders would be both simplified +and extended. We'd remove complicated things like read+write access to +a file, and give less posix-like guarantees. We also make seek and +truncate support optional in the backend. But then we will extend the +set of operations possible to allow things like copy on the remote +side (to avoid a download+upload operation on copy) and to have +a set of highlevel operations that applications want, like "save" that +implements the best way to save for each particular backend. + +We support metadata like display name, mimetypes, icon, and some +general information like length and mtime. But we make support for +getting the full "struct stat" buffer backend optional, as that isn't +a good abstraction for most backends. Also, the API will be designed +on the idea that network latency is expensive, so that there will only +be one call to stat() or readdir() needed to read all the metadata +requested by the application. (Whereas posix will have readdir return +only the names and force you to stat each file in a separate +roundtrip.) + +We likely don't want the full gnome/unix vfs implementation in +glib, instead glib will only ship an implementation of the vfs API for +local file access, and one that communicates to the vfs +daemon(s). Then we ship the daemon and the implementations of the +various backends externally. + +We will also write a single gnome-vfs backend that allows access to +all the glib vfs shares by using a uri like gvfs:///XXX that just maps +to //.network/XXX. We can also implement a similar backend for kio so +that kde applications can read and write to the shares. + +Furthermore, if FUSE is supported on the system we can write a FUSE +filesystem so that we can access the files as $HOME/.network/XXX. This +can be made extra nice if the application (like e.g. acrobat) uses +the gtk+ file selector but not the vfs by having the file selector +detect a filename like this and reverse-mapping it into a vfs pathname +and use the vfs for folder access. + +I've been doing some initial sketching of the glib API, and I've +started by introducing base GInputStream and GOutputStream similar to +the stream objects in Java and .Net. These have async i/o support and +will make the API for reading and writing files nicer and more +modern. There is also a GSeekable interface that streams can +optionally implement if they support seeking. + +I've also introduced a GFile object that wraps a file path. This means +you don't have to do tedious string operations on the pathnames to +navigate the filesystem. It also means we can use the openat() class +of file operations when traversing the filesystem tree, avoiding some +forms of races when we do things like recursive copies. + +To support the stateful model and still have some form of caching we +will also need to add some cache specific api so that you can trigger +a reload of information from a directory. Otherwise a reload operation +in the file manager wouldn't always get the latest state on something +like a ftp share where we cache things aggressively. + +I have some initial code here for some of the basic APIs, but its far +from finished and I'd like to spend some more time working on it +before I present it. However, I think the general architecture is +pretty sound and in a state where it can be discussed. + +Hopefully this description of my plans is enought to make people +understand some of my ideas and allow us to start and discussion about +the future of gnome-vfs. Also, consider it a heads up that I and other +people will likely be working on this this in the future. diff --git a/txt/vfs-names.txt b/txt/vfs-names.txt new file mode 100644 index 00000000..25833d7c --- /dev/null +++ b/txt/vfs-names.txt @@ -0,0 +1,142 @@ +Local filenames (in utf8 mode) +1) standard: /etc/passwd +2) utf8 and spaces: "/tmp/a åäö.txt" (encoding==utf8) +3) latin-1 and spaces: "/tmp/a åäö.txt" (encoding==iso8859-1) +4) filename without encoding: "/tmp/bad:\001\010\011\012\013" (as a C string) +5) mountpoint: /mnt/cdrom (cd has title "CD Title") + +Ftp mount to ftp.gnome.org +(where filenames are stored as utf8, this is detected by using + ftp protocol extensions (there is an rfc) or by having the user + specify the encoding at mount time) + +6) normal dir: /pub/sources +7) valid utf8 name: /dir/a file öää.txt +8) latin-1 name: /dir/a file öää.txt + +Ftp mount to ftp.gnome.org (with filenames in latin-1) +9) latin-1 name: /dir/a file öää.txt + +backend that stores display name separate from real name. Examples +could be a flickr backend, a file backend that handles desktop files, +or a virtual location like computer:// (which is implemented using +virtual desktop files atm). + +10) /tmp/foo.desktop (with Name[en]="Display Name") + +special cases: +ftp names relative to login dir + +Places where display filenames (i.e utf-8 strings) are used: + +A) Absolute filename, for editing (nautilus text entry, file selector entry) +B) Semi-Absolute filename, for display (nautilus window title) +C) Relative file name, for display (in nautilus/file selector icon/list view) +D) Relative file name, for editing (rename in nautilus) +E) Relative file name, for creating absolute name (filename completion for a) + This needs to know the exact form of the parent (i.e. it differs for filename vs uri). + I won't list this below as its always the same as A from the last slash to the end. + +This is how these work with gnome-vfs uris: + + A B C D +1) file:///etc/passwd passwd passwd passwd +2) file:///tmp/a%20%C3%B6%C3%A4%C3%A4.txt a åäö.txt a åäö.txt a åäö.txt +3) file:///tmp/a%20%E5%E4%F6.txt a ???.txt a ???.txt (invalid unicode) a ???.txt +4) file:///tmp/bad%3A%01%08%09%0A%0B bad:????? bad:????? (invalid unicode) bad:????? +5) file:///mnt/cdrom CD Title (cdrom) CD Title (cdrom) CD Title +6) ftp://ftp.gnome.org/pub/sources sources on ftp.gnome.org sources sources +7) ftp://ftp.gnome.org/dir/a%20%C3%B6%C3%A4%C3%A4.txt a åäö.txt on ftp.gnome.org a åäö.txt a åäö.txt +8) ftp://ftp.gnome.org/dir/a%20%E5%E4%F6.txt a ???.txt on ftp.gnome.org a ???.txt (invalid unicode) a ???.txt +9) ftp://ftp.gnome.org/dir/a%20%E5%E4%F6.txt a åäö.txt on ftp.gnome.org a åäö.txt a åäö.txt +10)file:///tmp/foo.desktop Display Name Display Name Display Name + +The stuff in column A is pretty insane. It works fine as an identifier +for the computer to use, but nobody would want to have to type that in +or look at that all the time. That is why Nautilus also allows +entering some filenames as absolute unix pathnames, although not all +filenames can be specified this way. If used when possible the column +looks like this: + + A +1) /etc/passwd +2) /tmp/a åäö.txt +3) file:///tmp/a%20%E5%E4%F6.txt +4) file:///tmp/bad%3A%01%08%09%0A%0B +5) /mnt/cdrom +6) ftp://ftp.gnome.org/pub/sources +7) ftp://ftp.gnome.org/dir/a%20%C3%B6%C3%A4%C3%A4.txt +8) ftp://ftp.gnome.org/dir/a%20%E5%E4%F6.txt +9) ftp://ftp.gnome.org/dir/a%20%E5%E4%F6.txt +10)/tmp/foo.desktop + +As we see this helps for most normal local paths, but it becomes +problematic when the filenames are in the wrong encoding. For +non-local files it doesn't help at all. We still have to look at these +horrible escapes, even when we know the encoding of the filename. + +The examples 7-9 in this version shows the problem with URIs. Suppose +we allowed an invalid URI like "ftp://ftp.gnome.org/dir/a åäö.txt" +(utf8-encoded string). Given the state inherent in the mountpoint we +know what encoding is used for the ftp server, so if someone types it +in we know which file they mean. However, suppose someone pastes a URI +like that into firefox, or mails it to someone, now we can't +reconstruct the real valid URI anymore. If you drag and drop it +however, the code can send the real valid uri so that firefox can load +it correctly. + +So, this introduces two kinds of of URIs that are "mostly similar" but +breaks in many nonobvious cases. This is very unfortunate, imho not +acceptable. I think its ok to accept a URI typed in like +"ftp://ftp.gnome.org/dir/a åäö.txt" and convert it to the right uri, +but its not right to display such a uri in the nautilus location bar, +as that can result in that invalid uri getting into other places. + +Since I dislike showing invalid URIs in the UI I think it makes sense +to create a new absolute pathname display and entry format. Ideally +such a system should allow any ascii or utf8 local filename to be +represented as itself. Furthermore it would allow input of URIs, but +immediately convert them to the display format (similar to how +inputing a file:// uri in nautilus displays as a normal filename). + +One solution would be to use some other prefix than / for +non-local files, and to use some form of escaping only for non-utf8 +chars and non-printables. Here is an example: + + A +1) /etc/passwd +2) /tmp/a åäö.txt +3) /tmp/a \xE5\xE4\xF6.txt +4) /tmp/bad:\x01\x08\x09\x0A\x0B +5) /mnt/cdrom +6) :ftp:ftp.gnome.org/pub/sources +7) :ftp:ftp.gnome.org/dir/a åäö.txt +8) :ftp:ftp.gnome.org/dir/a \xE5\xE4\xF6.txt +9) :ftp:ftp.gnome.org/dir/a åäö.txt +10)/tmp/foo.desktop + +Under the hood this would use proper, valid escaped URIs. However, we +would display things in the UI that made some sense to users, only +falling back to escaping in the last possible case. + +The API could look something like: + +GFile *g_file_new_from_filename (char *filename); +GFile *g_file_new_from_uri (char *uri); +GFile *g_file_parse_display_name (char *display_name); + +Another approach (mentioned by Jürg Billeter on irc yesterday) is to +move from a pure textual representation of the full uri to a more +structured UI. For example the ftp://ftp.gnome.org/ part of the URI +could be converted to a single item in the entry looking like +[#ftp.gnome.org] (where # is an ftp icon). Then the rest of the entry +would edit just the path on the ftp server, as a local filename. The +disadvantage here is that its a bit harder to know how to type in a +full pathname including what method to use and what server (you'd type +in a URI). This isn't necessarily a huge problem if you rarely type in +remote URIs (instead you can follow links, browse the network, add +favourites, etc). + +I don't know how hard this is to do from a Gtk+ perspective +though. Its somewhat similar to what the evolution address entry does. + |