summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorSteven Danna <steve@chef.io>2016-12-13 17:57:08 +0000
committerSteven Danna <steve@chef.io>2016-12-13 17:57:08 +0000
commitc20f0740ce2fce5f3968e9f18a2ac8ac6c9b779f (patch)
tree3eb30cfe83a866ae6dc2e1718647367ae7fbf20c
parentc56cf68d67e01c0923009b58e01d24ce0efe537d (diff)
downloadchef-ssd/chef-fs-n-squared.tar.gz
[cheffs] Don't iterate parent object on exist? callssd/chef-fs-n-squared
Previously, this code determined if an object existed by doing the following: parent.children.any? { |child| child.api_child_name == api_child_name } For organizations and object types with a small number of total objects, this wasn't problematic; however, it has very bad worst-case behavior. For example, if a user was attempting to restore an organization with 30k client records to an empty organization, each client upload would do the following: 1. Run GET /clients 2. Iterate the list returned in (1) normalizing the clients names along the way. 3. Upload the client via POST when exist? returned false. When the clients don't exist, this means step (2) will always iterate over every member returned in step (1). By the time you get into the 1000s of clients, this iteration dominates the running time of the process. For instance, consider the following ruby profile data: Measure Mode: wall_time Thread ID: 16939380 Fiber ID: 22163920 Total: 109.860468 Sort by: self_time %self total self wait child calls name 26.83 29.475 29.475 0.000 0.000 500 <Class::IO>#select 8.60 38.876 9.446 0.000 29.430 464125 <Class::Chef::ChefFS::PathUtils>#join 7.02 12.226 7.717 0.000 4.509 928750 Chef::ChefFS::FileSystem::ChefServer::RestListEntry#api_child_name 6.79 7.459 7.459 0.000 0.000 930250 String#gsub 5.31 5.834 5.834 0.000 0.000 2320625 <Class::Chef::ChefFS::PathUtils>#regexp_path_separator 3.33 3.654 3.654 0.000 0.000 500 OpenSSL::X509::Store#set_default_paths 3.28 42.482 3.606 0.000 38.876 464125 Chef::ChefFS::FileSystem::BaseFSObject#initialize 2.95 3.244 3.244 0.000 0.000 930250 <Class::File>#extname 2.29 48.421 2.513 0.000 45.908 483375 *Class#new 38 wall clock seconds spent in PathUtils.join (part of the name normalization during the list walk) Note, an alternative might be to skip the exist? check completely, opting to blindly POST and then rescuing the 409 and retry with a PUT. Experimentation shows that PathUtils.join can also be improved substantially; however, I'll leave those for a follow-up PR. Signed-off-by: Steven Danna <steve@chef.io>
-rw-r--r--lib/chef/chef_fs/file_system/chef_server/rest_list_entry.rb9
1 files changed, 8 insertions, 1 deletions
diff --git a/lib/chef/chef_fs/file_system/chef_server/rest_list_entry.rb b/lib/chef/chef_fs/file_system/chef_server/rest_list_entry.rb
index b8ec5f8524..eb793d5e87 100644
--- a/lib/chef/chef_fs/file_system/chef_server/rest_list_entry.rb
+++ b/lib/chef/chef_fs/file_system/chef_server/rest_list_entry.rb
@@ -69,7 +69,14 @@ class Chef
def exists?
if @exists.nil?
begin
- @exists = parent.children.any? { |child| child.api_child_name == api_child_name }
+ rest.get(api_path)
+ @exists = true
+ rescue Net::HTTPServerException => e
+ if e.response.code == "404"
+ @exists = false
+ else
+ raise
+ end
rescue Chef::ChefFS::FileSystem::NotFoundError
@exists = false
end