Oops.

author: Darin Adler <darin@src.gnome.org> 2001-08-22 00:30:10 +0000
committer: Darin Adler <darin@src.gnome.org> 2001-08-22 00:30:10 +0000
commit: 05e1d3ef6ac74a5523cd1001f0412631c58f6f6c (patch)
tree: dd38e9336b5b9532aaa6c1a12250f4ce7d3e4694 /docs
parent: c7ca23eef98deec74769c9ddbd3f7fad366c05a6 (diff)
download: nautilus-05e1d3ef6ac74a5523cd1001f0412631c58f6f6c.tar.gz
1 files changed, 231 insertions, 1 deletions
diff --git a/docs/nautilus-io.txt b/docs/nautilus-io.txt
index 0e0df56f1..167c5887d 100644
--- a/docs/nautilus-io.txt
+++ b/docs/nautilus-io.txt
@@ -1 +1,231 @@
-Nautilus I/O Primer
-draft ("Better Than Nothing")
-2001-08-21
-Darin Adler <darin@bentspoon.com>
-The Nautilus shell, and the file manager inside it, does a lot of
-I/O. Because of this, there are some special disciplines required when
-writing Nautilus code.
-No I/O on the main thread
-To be able to respond to the user quickly, Nautilus needs to be
-designed so that the main user input thread does not block. The basic
-approach is to never do any disk I/O on the main thread.
-In practice, Nautilus code does assume that some disk I/O is fast, in
-some cases intentionally and in other cases due to programmer
-sloppiness. The typical assumption is that reading files from the
-user's home directory and the installed files in the Nautilus datadir
-are very fast, effectively instantaneous.
-So the general approach is to allow I/O for files that have file
-system paths, assuming that the access to these files is fast, and to
-prohibit I/O for files that have arbitrary URIs, assuming that access
-to these could be arbitrarily slow. Although this works pretty well,
-it is based on an incorrect assumption, because with NFS and other
-kinds of abstract file systems, there can be arbitrarily slow parts of
-the file system that have file system paths.
-For historical reasons, threading in Nautilus is done through the
-gnome-vfs asynchronous I/O abstraction rather than using threads
-directly. This means that all the threads are created by gnome-vfs,
-and Nautilus code runs on the main thread only. Thus, the rule of
-thumb is that synchronous gnome-vfs operations, like the ones in
-<libgnomevfs/gnome-vfs-ops.h> are illegal in most Nautilus
-code. Similarly, it's illegal to ask for a piece of information, say a
-file size, and then wait until it arrives. The program's main thread
-must be allowed to get back to the main loop and start asking for user
-input again.
-How NautilusFile is used to do this
-The NautilusFile class presents an API for scheduling this
-asynchronous I/O, and dealing with the uncertainty of when the
-information will be available. (It also does a few other things, but
-that's the main service it provides.) When you want information about
-a particular file or directory, you get the NautilusFile object for
-that item, using the nautilus_file_get. This operation, like most
-NautilusFile operations, is not allowed to do any disk I/O. Once you
-have a NautilusFile object, you can ask it questions like "What is
-your file type?" by calling functions like
-nautilus_file_get_file_type. However, in a newly created NautilusFile
-object, the answer is almost certainly "I don't know." Each function
-defines a default, which is the answer given for "I don't know." For
-example, nautilus_file_get_type will return
-GNOME_VFS_FILE_TYPE_UNKNOWN if it doesn't yet know the type.
-It's worth taking a side trip to discuss the nature of the
-NautilusFile API. Since these classes are a private part of the
-Nautilus implementation, we make no effort to have the API be
-"complete" in an abstract sense. Instead we add operations as
-necessary and give them the semantics that are most handy for our
-purposes. For example, we could have a nautilus_file_get_size that
-returns a special distinguishable value to mean "I don't know" or a
-separate boolean instead of returning 0 for files where the size is
-unknown. This is entirely motivated by pragmatic concerns. The intent
-is that we tweak these calls as needed if the semantics aren't good
-enough.
-Back to the newly created NautilusFile object. If you actually need to
-get the type, you need to arrange for that information to be fetched
-from the file system. There are two ways to make this request. If you
-are planning to display the type on an ongoing basis, then you want to
-tell the NautilusFile that you'll be monitoring the type and want to
-know about changes to it. If you just need one-time information about
-the type then you'll want to be informed when the type is
-discovered. The calls used for this are nautilus_file_monitor_add and
-nautilus_file_call_when_ready respectively. Both of these calls take a
-list of information needed about a file. If all you need is the file
-type, for example, you would pass a list containing just
-NAUTILUS_FILE_ATTRIBUTE_FILE_TYPE (the attributes are defined in
-nautilus-file-attributes.h). Not every call has a corresponding file
-attribute type. We add new ones as needed.
-If you do a nautilus_file_monitor_add, you also typically connect to
-the NautilusFile object's changed signal. Each time any monitored
-attribute changes, a changed signal is emitted. The caller typically
-caches the value of the attribute that was last seen (for example,
-what's displayed on screen) and does a quick check to see if the
-attribute it cares about has changed. If you do a
-nautilus_file_call_when_ready, you don't typically need to connect to
-the changed signal, because your callback function will be called when
-and if the requested information is ready.
-Both a monitor and a call when ready can be cancelled. For ease of
-use, neither call requires that you store an ID for
-canceling. Instead, the monitor function uses an arbitrary client
-pointer, which can be any kind of pointer that's known to not conflict
-with other monitorers. Usually, this is a pointer to the monitoring
-object, but it can also be, for example, a pointer to a global
-variable. The call_when_ready function uses the callback and callback
-data to identify the particular callback. One advantage of the monitor
-API is that it also lets the NautilusFile framework know that the file
-should be monitored for changes made outside Nautilus. This is how we
-know when to ask FAM to monitor a file for us.
-Lets review a few of the concepts:
-1) Nearly all NautilusFile operations, like nautilus_file_get_type,
-   are not allowed to do any disk I/O.
-2) To cause the actual I/O to be done, callers need to use either a
-   monitor or a call when ready.
-3) The actual I/O is done by asynchronous gnome-vfs calls, and this is
-   done on another thread.
-When working with an entire directory of files at once, you work with
-a NautilusDirectory object. With the NautilusDirectory object you can
-monitor a whole set of NautilusFile objects at once, and you can
-connect to a single "files_changed" signal that gets emitted whenever
-files within the directory are modified. That way you don't have to
-connect separately to each file you want to monitor. These calls are
-also the mechanism for finding out which files are in a directory. In
-most other respects, they are like the NautilusFile calls.
-Caching, the good and the bad
-Another feature of the NautilusFile class is the caching. If you keep
-around a NautilusFile object, it keeps around information about the
-last known state of that file. Thus, if you call
-nautilus_file_get_type, you might well get file type of the file found
-at this location the last time you looked, rather than the information
-about what the file type is now, or "unknown". There are some problems
-with this, though.
-The first problem is that if wrong information is cached, you need
-some way to "goose" the NautilusFile object and get it to grab new
-information. This is trickier than it might sound, because we don't
-want to constantly distrust information we received just moments
-before. To handle this, we have the
-nautilus_file_invalidate_attributes and
-nautilus_file_invalidate_all_attributes calls, as well as the
-nautilus_directory_force_reload call. If some code in Nautilus makes a
-change to a file that's known to affect the cached information, it can
-call one of these to inform the NautilusFile framework. Changes that
-are made through the framework itself are automatically understood, so
-usually these calls aren't necessary.
-The second problem is that it's hard to predict when information will
-and won't be cached. The current rule that's implemented is that no
-information is cached if no one retains a reference to the
-NautilusFile object. This means that someone else holding a
-NautilusFile object can subtly affect the semantics of whether you
-have new data or not. Calling nautilus_file_call_when_ready or
-nautilus_file_monitor_add will not invalidate the cache, but rather
-will return you the already cached information.
-These problems are less pronounced when FAM is in use. With FAM, any
-monitored file is highly likely to have accurate information, because
-changes to the file will be noticed by FAM, and that in turn will
-trigger new I/O to determine what the new status of the file is.
-Operations that change the file
-You'll note that up until this point, I've only discussed getting
-information about the file, not making changes to it. NautilusFile
-also contains some APIs for making changes. There are two kinds of
-these.
-The calls that change metadata are an example of the first kind. These
-calls make changes to the internal state right away, and schedule I/O
-to write the changes out to the file system. There's no way to detect
-if the I/O succeeds or fails, and as far as the client code is
-concerned, the change takes place right away.
-The calls that make other kinds of file system change are an example
-of the second kind. These calls take a
-NautilusFileOperationCallback. They are all cancellable, and they give
-the callback when the operation completes, whether it succeeds or
-fails.
-Icons
-The current implementation of the Nautilus icon factory uses
-synchronous I/O to get the icons and ignores these guidelines. The
-only reason this doesn't ruin the Nautilus user experience is that it
-also refuses to even try to fetch icons from URIs that don't
-correspond to file system paths, which for most cases means it limits
-itself to reading from the high-speed local disk. Don't ask me what
-the repercussions of this are for NFS; do the research and tell me
-instead!
-Slowness caused by asynchronous operations
-The danger in all this asynchronous I/O is that you might end up doing
-lots of user interface tasks twice. If you go to display a file right
-after asking for information about it, you might immediately show an
-"unknown file type" icon. Then, milliseconds later, you may complete
-the I/O and discover more information about the file, including the
-appropriate icon. So you end up drawing everything twice. There are a
-number of strategies for preventing this problem. One of them is to
-allow a bit of hysteresis, and wait some fixed amount of time after
-requesting the I/O before displaying the "unknown" state. [What
-strategy is used in Nautilus right now?]
-How to make Nautilus slow
-If you add I/O to the functions in NautilusFile that are used simply
-to fetch cached file information, you can make Nautilus incredibly I/O
-intensive. On the other hand, the NautilusFile API does not provide a
-way to do arbitrary file reads, for example. So it can be tricky to
-add features to Nautilus, since you first have to educate NautilusFile
-about how to do the I/O asynchronously and cache it, then request the
-information and have some way to deal with the time when it's not yet
-known.
-Adding new kinds of I/O usually involves working on the Nautilus I/O
-state machine in nautilus-directory-async.c. If we changed Nautilus to
-use threading instead of using gnome-vfs asychronous operations, I'm
-pretty sure that most of the changes would be here in this
-file. That's because the external API used for NautilusFile wouldn't
-really have a reason to change. In either case, you'd want to schedule
-work to be done, and get called back when the work is complete.
-[We probably need more about nautilus-directory-async.c here.]
-That's all for now
-This is a very rough early draft of this document. Let me know about
-other topics that would be useful to be covered in here.
-    -- Darin
-\ No newline at end of file
+Nautilus I/O Primer
+draft ("Better Than Nothing")
+2001-08-21
+Darin Adler <darin@bentspoon.com>
+
+The Nautilus shell, and the file manager inside it, does a lot of
+I/O. Because of this, there are some special disciplines required when
+writing Nautilus code.
+
+No I/O on the main thread
+
+To be able to respond to the user quickly, Nautilus needs to be
+designed so that the main user input thread does not block. The basic
+approach is to never do any disk I/O on the main thread.
+
+In practice, Nautilus code does assume that some disk I/O is fast, in
+some cases intentionally and in other cases due to programmer
+sloppiness. The typical assumption is that reading files from the
+user's home directory and the installed files in the Nautilus datadir
+is very fast, effectively instantaneous.
+
+So the general approach is to allow I/O for files that have file
+system paths, assuming that the access to these files is fast, and to
+prohibit I/O for files that have arbitrary URIs, assuming that access
+to these could be arbitrarily slow. Although this works pretty well,
+it is based on an incorrect assumption, because with NFS and other
+kinds of abstract file systems, there can be arbitrarily slow parts of
+the file system that have file system paths.
+
+For historical reasons, threading in Nautilus is done through the
+gnome-vfs asynchronous I/O abstraction rather than using threads
+directly. This means that all the threads are created by gnome-vfs,
+and Nautilus code runs on the main thread only. Thus, the rule of
+thumb is that synchronous gnome-vfs operations, like the ones in
+<libgnomevfs/gnome-vfs-ops.h> are illegal in most Nautilus
+code. Similarly, it's illegal to ask for a piece of information, say a
+file size, and then wait until it arrives. The program's main thread
+must be allowed to get back to the main loop and start asking for user
+input again.
+
+How NautilusFile is used to do this
+
+The NautilusFile class presents an API for scheduling this
+asynchronous I/O, and dealing with the uncertainty of when the
+information will be available. (It also does a few other things, but
+that's the main service it provides.) When you want information about
+a particular file or directory, you get the NautilusFile object for
+that item, using nautilus_file_get. This operation, like most
+NautilusFile operations, is not allowed to do any disk I/O. Once you
+have a NautilusFile object, you can ask it questions like "What is
+your file type?" by calling functions like
+nautilus_file_get_file_type. However, in a newly created NautilusFile
+object, the answer is almost certainly "I don't know." Each function
+defines a default, which is the answer given for "I don't know." For
+example, nautilus_file_get_file_type will return
+GNOME_VFS_FILE_TYPE_UNKNOWN if it doesn't yet know the type.
+
+It's worth taking a side trip to discuss the nature of the
+NautilusFile API. Since these classes are a private part of the
+Nautilus implementation, we make no effort to have the API be
+"complete" in an abstract sense. Instead we add operations as
+necessary and give them the semantics that are most handy for our
+purposes. For example, we could have a nautilus_file_get_size that
+returns a special distinguishable value to mean "I don't know" or a
+separate boolean instead of returning 0 for files where the size is
+unknown. This is entirely motivated by pragmatic concerns. The intent
+is that we tweak these calls as needed if the semantics aren't good
+enough.
+
+Back to the newly created NautilusFile object. If you actually need to
+get the type, you need to arrange for that information to be fetched
+from the file system. There are two ways to make this request. If you
+are planning to display the type on an ongoing basis, then you want to
+tell the NautilusFile that you'll be monitoring the type and want to
+know about changes to it. If you just need one-time information about
+the type then you'll want to be informed when the type is
+discovered. The calls used for this are nautilus_file_monitor_add and
+nautilus_file_call_when_ready respectively. Both of these calls take a
+list of information needed about a file. If all you need is the file
+type, for example, you would pass a list containing just
+NAUTILUS_FILE_ATTRIBUTE_FILE_TYPE (the attributes are defined in
+nautilus-file-attributes.h). Not every call has a corresponding file
+attribute type. We add new ones as needed.
+
+If you do a nautilus_file_monitor_add, you also typically connect to
+the NautilusFile object's changed signal. Each time any monitored
+attribute changes, a changed signal is emitted. The caller typically
+caches the value of the attribute that was last seen (for example,
+what's displayed on screen) and does a quick check to see if the
+attribute it cares about has changed. If you do a
+nautilus_file_call_when_ready, you don't typically need to connect to
+the changed signal, because your callback function will be called when
+and if the requested information is ready.
+
+Both a monitor and a call when ready can be cancelled. For ease of
+use, neither call requires that you store an ID for
+cancelling. Instead, the monitor function uses an arbitrary client
+pointer, which can be any kind of pointer that's known to not conflict
+with other monitorers. Usually, this is a pointer to the monitoring
+object, but it can also be, for example, a pointer to a global
+variable. The call_when_ready function uses the callback and callback
+data to identify the particular callback. One advantage of the monitor
+API is that it also lets the NautilusFile framework know that the file
+should be monitored for changes made outside Nautilus. This is how we
+know when to ask FAM to monitor a file for us.
+
+Let's review a few of the concepts:
+
+1) Nearly all NautilusFile operations, like nautilus_file_get_file_type,
+   are not allowed to do any disk I/O.
+2) To cause the actual I/O to be done, callers need to use either a
+   monitor or a call when ready.
+3) The actual I/O is done by asynchronous gnome-vfs calls, and this is
+   done on another thread.
+
+When working with an entire directory of files at once, you work with
+a NautilusDirectory object. With the NautilusDirectory object you can
+monitor a whole set of NautilusFile objects at once, and you can
+connect to a single "files_changed" signal that gets emitted whenever
+files within the directory are modified. That way you don't have to
+connect separately to each file you want to monitor. These calls are
+also the mechanism for finding out which files are in a directory. In
+most other respects, they are like the NautilusFile calls.
+
+Caching, the good and the bad
+
+Another feature of the NautilusFile class is the caching. If you keep
+around a NautilusFile object, it keeps around information about the
+last known state of that file. Thus, if you call
+nautilus_file_get_file_type, you might well get the type of the file
+found at this location the last time you looked, rather than the
+information about what the file type is now, or "unknown". There are
+some problems with this, though.
+
+The first problem is that if wrong information is cached, you need
+some way to "goose" the NautilusFile object and get it to grab new
+information. This is trickier than it might sound, because we don't
+want to constantly distrust information we received just moments
+before. To handle this, we have the
+nautilus_file_invalidate_attributes and
+nautilus_file_invalidate_all_attributes calls, as well as the
+nautilus_directory_force_reload call. If some code in Nautilus makes a
+change to a file that's known to affect the cached information, it can
+call one of these to inform the NautilusFile framework. Changes that
+are made through the framework itself are automatically understood, so
+usually these calls aren't necessary.
+
+The second problem is that it's hard to predict when information will
+and won't be cached. The current rule that's implemented is that no
+information is cached if no one retains a reference to the
+NautilusFile object. This means that someone else holding a
+NautilusFile object can subtly affect the semantics of whether you
+have new data or not. Calling nautilus_file_call_when_ready or
+nautilus_file_monitor_add will not invalidate the cache, but rather
+will return you the already cached information.
+
+These problems are less pronounced when FAM is in use. With FAM, any
+monitored file is highly likely to have accurate information, because
+changes to the file will be noticed by FAM, and that in turn will
+trigger new I/O to determine what the new status of the file is.
+
+Operations that change the file
+
+You'll note that up until this point, I've only discussed getting
+information about the file, not making changes to it. NautilusFile
+also contains some APIs for making changes. There are two kinds of
+these.
+
+The calls that change metadata are an example of the first kind. These
+calls make changes to the internal state right away, and schedule I/O
+to write the changes out to the file system. There's no way to detect
+if the I/O succeeds or fails, and as far as the client code is
+concerned, the change takes place right away.
+
+The calls that make other kinds of file system change are an example
+of the second kind. These calls take a
+NautilusFileOperationCallback. They are all cancellable, and they give
+the callback when the operation completes, whether it succeeds or
+fails.
+
+Icons
+
+The current implementation of the Nautilus icon factory uses
+synchronous I/O to get the icons and ignores these guidelines. The
+only reason this doesn't ruin the Nautilus user experience is that it
+also refuses to even try to fetch icons from URIs that don't
+correspond to file system paths, which for most cases means it limits
+itself to reading from the high-speed local disk. Don't ask me what
+the repercussions of this are for NFS; do the research and tell me
+instead!
+
+Slowness caused by asynchronous operations
+
+The danger in all this asynchronous I/O is that you might end up doing
+lots of user interface tasks twice. If you go to display a file right
+after asking for information about it, you might immediately show an
+"unknown file type" icon. Then, milliseconds later, you may complete
+the I/O and discover more information about the file, including the
+appropriate icon. So you end up drawing everything twice. There are a
+number of strategies for preventing this problem. One of them is to
+allow a bit of hysteresis, and wait some fixed amount of time after
+requesting the I/O before displaying the "unknown" state. [What
+strategy is used in Nautilus right now?]
+
+How to make Nautilus slow
+
+If you add I/O to the functions in NautilusFile that are used simply
+to fetch cached file information, you can make Nautilus incredibly I/O
+intensive. On the other hand, the NautilusFile API does not provide a
+way to do arbitrary file reads, for example. So it can be tricky to
+add features to Nautilus, since you first have to educate NautilusFile
+about how to do the I/O asynchronously and cache it, then request the
+information and have some way to deal with the time when it's not yet
+known.
+
+Adding new kinds of I/O usually involves working on the Nautilus I/O
+state machine in nautilus-directory-async.c. If we changed Nautilus to
+use threading instead of using gnome-vfs asynchronous operations, I'm
+pretty sure that most of the changes would be here in this
+file. That's because the external API used for NautilusFile wouldn't
+really have a reason to change. In either case, you'd want to schedule
+work to be done, and get called back when the work is complete.
+
+[We probably need more about nautilus-directory-async.c here.]
+
+That's all for now
+
+This is a very rough early draft of this document. Let me know about
+other topics that would be useful to be covered in here.
+
+    -- Darin
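
The call-when-ready flow that the primer describes can be sketched in
plain C. This is a minimal sketch, not the Nautilus implementation:
every name in it (SketchFile, sketch_file_get_file_type,
sketch_file_call_when_ready, sketch_file_cancel_call_when_ready,
sketch_main_loop_iterate, sketch_record_type) is hypothetical, and the
asynchronous I/O is simulated by copying a field during one pretend
main loop iteration. It only illustrates three points from the text:
the getter never does I/O and answers "unknown" by default, the request
returns immediately, and cancellation is keyed on the callback and
callback data rather than on a stored ID.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not the Nautilus API. */
typedef enum {
    SKETCH_TYPE_UNKNOWN,   /* the default answer for "I don't know" */
    SKETCH_TYPE_REGULAR,
    SKETCH_TYPE_DIRECTORY
} SketchFileType;

typedef struct SketchFile SketchFile;
typedef void (*SketchReadyCallback)(SketchFile *file, void *callback_data);

struct SketchFile {
    SketchFileType cached_type;   /* what the getter reports */
    SketchFileType type_on_disk;  /* stands in for what real I/O would find */
    SketchReadyCallback callback; /* pending one-shot request, or NULL */
    void *callback_data;
};

/* Cheap getter: never does I/O, only reports the cached answer. */
SketchFileType sketch_file_get_file_type(const SketchFile *file)
{
    assert(file != NULL);
    return file->cached_type;
}

/* Schedule a one-time notification and return immediately. */
void sketch_file_call_when_ready(SketchFile *file,
                                 SketchReadyCallback callback,
                                 void *callback_data)
{
    assert(file != NULL && callback != NULL);
    file->callback = callback;
    file->callback_data = callback_data;
}

/* Cancel a pending request, identified by callback plus callback data. */
void sketch_file_cancel_call_when_ready(SketchFile *file,
                                        SketchReadyCallback callback,
                                        void *callback_data)
{
    assert(file != NULL);
    if (file->callback == callback && file->callback_data == callback_data) {
        file->callback = NULL;
        file->callback_data = NULL;
    }
}

/* Stands in for one main loop iteration in which the I/O completes. */
void sketch_main_loop_iterate(SketchFile *file)
{
    SketchReadyCallback callback;
    void *callback_data;

    assert(file != NULL);
    callback = file->callback;
    callback_data = file->callback_data;

    file->cached_type = file->type_on_disk; /* the simulated result arrives */
    if (callback != NULL) {
        file->callback = NULL;              /* requests are one-shot */
        file->callback_data = NULL;
        callback(file, callback_data);
    }
}

/* Example callback: copies the now-known type into the caller's variable. */
void sketch_record_type(SketchFile *file, void *callback_data)
{
    *(SketchFileType *) callback_data = sketch_file_get_file_type(file);
}
```

A caller would create a SketchFile whose cached_type is
SKETCH_TYPE_UNKNOWN, call sketch_file_call_when_ready with
sketch_record_type and a pointer to its own SketchFileType variable,
and then return to the main loop. The variable is filled in only when
sketch_main_loop_iterate runs, and never if the request was cancelled
first with the same callback and data pair.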