author    Sam Thursfield <sam.thursfield@codethink.co.uk>  2014-10-14 16:41:16 +0100
committer Sam Thursfield <sam.thursfield@codethink.co.uk>  2014-10-14 16:41:16 +0100
commit    c11bcfcd39bd9c9e30184ea29d21ef52624d056a (patch)
tree      8b4fbe74ced0b68ced598e42c9f19182beea73ba
download  import-c11bcfcd39bd9c9e30184ea29d21ef52624d056a.tar.gz
Initial import of Baserock import tool for importing foreign packaging
-rw-r--r--  README              100
-rw-r--r--  README.omnibus       17
-rw-r--r--  README.rubygems      52
-rw-r--r--  importer_base.py     72
-rw-r--r--  importer_base.rb     81
-rw-r--r--  main.py             920
-rwxr-xr-x  omnibus.to_chunk    274
-rwxr-xr-x  omnibus.to_lorry     94
-rw-r--r--  omnibus.yaml          7
-rwxr-xr-x  rubygems.to_chunk   275
-rwxr-xr-x  rubygems.to_lorry   164
-rw-r--r--  rubygems.yaml        49
12 files changed, 2105 insertions, 0 deletions
diff --git a/README b/README
new file mode 100644
index 0000000..3ac7997
--- /dev/null
+++ b/README
@@ -0,0 +1,100 @@
+How to use the Baserock Import Tool
+===================================
+
+The tool helps you generate Baserock build instructions by importing metadata
+from a foreign packaging system.
+
+The process it follows is this:
+
+1. Pick a package from the processing queue.
+2. Find its source code, and generate a suitable .lorry file.
+3. Make it available as a local Git repo.
+4. Check out the commit corresponding to the requested version of the package.
+5. Analyse the source tree and generate a suitable chunk .morph to build the
+ requested package.
+6. Analyse the source tree and generate a list of dependencies for the package.
+7. Enqueue any new dependencies, and repeat.
+
+Once the queue is empty:
+
+8. Generate a stratum .morph for the package(s) the user requested.
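The loop above can be sketched as a simple worklist algorithm. This is an illustrative simplification only; the real implementation is in main.py below, which also tracks errors and build-dependency ordering:

```python
# Simplified sketch of the import loop: pick a package, import it, then
# enqueue any of its dependencies that have not been seen yet.
def import_loop(goal, import_package, find_dependencies):
    to_process = [goal]
    processed = set()
    while to_process:
        package = to_process.pop()         # step 1
        import_package(package)            # steps 2-5
        deps = find_dependencies(package)  # step 6
        for dep in deps:                   # step 7
            if dep not in processed and dep not in to_process:
                to_process.append(dep)
        processed.add(package)
    return processed
```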
+
+The tool is not magic. It can be taught the conventions for each packaging
+system, but these will not work in all cases. When an import fails it will
+continue to the next package, so that the first run does as many imports as
+possible.
+
+For imports that could not be done automatically, you will need to write an
+appropriate .lorry or .morph file manually and rerun the tool. It will resume
+processing where it left off.
+
+It's possible to teach the code about more conventions, but it is only
+worthwhile to do that for common patterns.
+
+
+Package-system specific code and data
+-------------------------------------
+
+For each supported packaging system, there should be an xxx.to_lorry program
+and an xxx.to_chunk program. These should write a .lorry file and a .morph
+file, respectively, to stdout.
+
+Each packaging system can have static data saved in a .yaml file, for known
+metadata that the programs cannot discover automatically.
+
+The following field should be honoured by most packaging systems:
+`known-source-uris`. It maps package name to source URI.
+
+
+Help with .lorry generation
+---------------------------
+
+The simplest fix is to add the source URI to the `known-source-uris` dict in
+the static metadata.
+
+If you write a .lorry file by hand, be sure to fill in the `x-products-YYY`
+field. The 'x' prefix means the field is an extension to the .lorry format,
+and YYY is the name of the packaging system, e.g. 'rubygems'. It should list
+the packages whose source code this repository contains.
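For example, a hand-written .lorry for a hypothetical 'hashie' Gem whose source lives in a repo of the same name might look like this (the URL is illustrative, not real):

```json
{
    "hashie": {
        "type": "git",
        "url": "https://github.com/example/hashie.git",
        "x-products-rubygems": ["hashie"]
    }
}
```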
+
+
+Help with linking package version to Git tag
+--------------------------------------------
+
+Some projects do not tag releases.
+
+Currently, you must create a tag in the local checkout for the tool to continue.
+In future, the Lorry tool should be extended to handle creation of missing
+tags, so that they are propagated to the project Trove. The .lorry file would
+need to contain a dict mapping product version number to commit SHA1.
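The tool looks for a tag named after the version: for package 'foo' version '1.2.3' it tries `1.2.3`, `v1.2.3` and `foo-1.2.3` in turn (see _checkout_source_version in main.py). So a missing tag can be supplied by hand with something like the following, where the path and SHA1 are placeholders for your checkout and the correct release commit:

```
cd checkouts/foo
git tag v1.2.3 <commit-sha1>
```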
+
+If you are in a hurry, you can use the `--use-master-if-no-tag` option. Instead
+of an error, the tool will use whatever is the `master` ref of the component
+repo.
+
+
+Help with chunk .morph generation
+---------------------------------
+
+If you create a chunk morph by hand, you must add some extra fields:
+
+ - `x-build-dependencies-YYY`
+ - `x-runtime-dependencies-YYY`
+
+These are a dict mapping dependency name to dependency version. For example:
+
+ x-build-dependencies-rubygems: {}
+ x-runtime-dependencies-rubygems:
+ hashie: 2.1.2
+ json: 1.8.1
+ mixlib-log: 1.6.0
+ rack: 1.5.2
+
+All dependencies will be included in the resulting stratum. Those which are build
+dependencies of other components will be added to the relevant 'build-depends'
+field.
+
+These fields are non-standard extensions to the morphology format.
+
+For more package-system specific information, see the relevant README file,
+e.g. README.rubygems for RubyGem imports.
diff --git a/README.omnibus b/README.omnibus
new file mode 100644
index 0000000..840bbab
--- /dev/null
+++ b/README.omnibus
@@ -0,0 +1,17 @@
+Omnibus import
+==============
+
+See 'README' for general information on the Baserock Import Tool.
+
+To use
+------
+
+First, clone the Git repository corresponding to the Omnibus project you want
+to import. For example, if you want to import the Chef Server, clone:
+<https://github.com/opscode/omnibus-chef-server>
+
+As per Omnibus' instructions, you should then run `bundle install --binstubs`
+in the checkout to fetch the various repos and Gems that the project
+definitions depend on.
+
+
diff --git a/README.rubygems b/README.rubygems
new file mode 100644
index 0000000..1afb62d
--- /dev/null
+++ b/README.rubygems
@@ -0,0 +1,52 @@
+Here is some information I have learned while importing RubyGem packages into
+Baserock.
+
+First, beware that RubyGem .gemspec files are actually normal Ruby programs,
+and are executed when loaded. A Bundler Gemfile is also a Ruby program, and
+could run arbitrary code when loaded.
+
+The Standard Case
+-----------------
+
+Most Ruby projects provide one or more .gemspec files, which describe the
+runtime and development dependencies of the Gem.
+
+Using the .gemspec file and the `gem build` command it is possible to create
+the .gem file. It can then be installed with `gem install`.
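In the standard case the two commands look something like this, where 'example' is a placeholder for the real Gem name and version:

```
gem build example.gemspec        # writes example-<version>.gem
gem install example-<version>.gem
```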
+
+Note that use of `gem build` is discouraged by its own help file in favour
+of using Rake, but there is much less standardisation among Rakefiles and they
+may introduce requirements on Hoe, rake-compiler, Jeweler or other tools.
+
+The 'development' dependencies include everything useful to test, document,
+and create a Gem of the project. All we want to do is create a Gem, which I'll
+refer to as 'building'.
+
+
+Gem with no .gemspec
+--------------------
+
+Some Gems choose not to include a .gemspec, like [Nokogiri]. In the case of
+Nokogiri, and others, [Hoe] is used, which adds Rake tasks that create the Gem.
+The `gem build` command cannot be used in these cases.
+
+You may be able to use the `rake gem` command instead of `gem build`.
+
+[Nokogiri]: https://github.com/sparklemotion/nokogiri/blob/master/Y_U_NO_GEMSPEC.md
+[Hoe]: http://www.zenspider.com/projects/hoe.html
+
+
+Signed Gems
+-----------
+
+It's possible for a Gem maintainer to sign their Gems. See:
+
+ - <http://blog.meldium.com/home/2013/3/3/signed-rubygems-part>
+ - <http://www.ruby-doc.org/stdlib-1.9.3/libdoc/rubygems/rdoc/Gem/Security.html>
+
+When building a Gem in Baserock, signing is unnecessary because it's not going
+to be shared except as part of the build system. The .gemspec may include a
+`signing_key` field, which will be a local path on the maintainer's system to
+their private key. Removing this field causes an unsigned Gem to be built.
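One way to automate the removal is a small filter applied to the gemspec text before building. This is a hypothetical helper for illustration, not part of this tool:

```python
def strip_signing_key(gemspec_text):
    # Remove any line that sets signing_key, so `gem build` produces an
    # unsigned Gem instead of failing to find the maintainer's private key.
    return '\n'.join(line for line in gemspec_text.splitlines()
                     if 'signing_key' not in line)
```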
+
+Known Gems that do this: 'net-ssh' and family.
diff --git a/importer_base.py b/importer_base.py
new file mode 100644
index 0000000..5def0dc
--- /dev/null
+++ b/importer_base.py
@@ -0,0 +1,72 @@
+# Base class for import tools written in Python.
+#
+# Copyright (C) 2014 Codethink Limited
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; version 2 of the License.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License along
+# with this program; if not, write to the Free Software Foundation, Inc.,
+# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+
+
+import logging
+import os
+import sys
+
+
+class ImportException(Exception):
+ pass
+
+
+class ImportExtension(object):
+ '''A base class for import extensions.
+
+    Subclasses should add a ``process_args`` method.
+
+ Note that it is not necessary to subclass this class for import extensions.
+ This class is here just to collect common code.
+
+ '''
+
+ def __init__(self):
+ self.setup_logging()
+
+ def setup_logging(self):
+ '''Direct all logging output to MORPH_LOG_FD, if set.
+
+ This file descriptor is read by Morph and written into its own log
+ file.
+
+ This overrides cliapp's usual configurable logging setup.
+
+ '''
+ log_write_fd = int(os.environ.get('MORPH_LOG_FD', 0))
+
+ if log_write_fd == 0:
+ return
+
+ formatter = logging.Formatter('%(message)s')
+
+ handler = logging.StreamHandler(os.fdopen(log_write_fd, 'w'))
+ handler.setFormatter(formatter)
+
+ logger = logging.getLogger()
+ logger.addHandler(handler)
+ logger.setLevel(logging.DEBUG)
+
+ def process_args(self, args):
+ raise NotImplementedError()
+
+ def run(self):
+ try:
+ self.process_args(sys.argv[1:])
+ except ImportException as e:
+            sys.stderr.write('ERROR: %s\n' % e)
+ sys.exit(1)
diff --git a/importer_base.rb b/importer_base.rb
new file mode 100644
index 0000000..4e7a7b5
--- /dev/null
+++ b/importer_base.rb
@@ -0,0 +1,81 @@
+#!/usr/bin/env ruby
+#
+# Base class for importers written in Ruby
+#
+# Copyright (C) 2014 Codethink Limited
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; version 2 of the License.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License along
+# with this program; if not, write to the Free Software Foundation, Inc.,
+# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+
+require 'json'
+require 'logger'
+require 'optparse'
+require 'yaml'
+
+module Importer
+ class Base
+ private
+
+ def create_option_parser(banner, description)
+ opts = OptionParser.new
+
+ opts.banner = banner
+
+ opts.on('-?', '--help', 'print this help') do
+ puts opts
+ print "\n", description
+ exit 255
+ end
+ end
+
+ def log
+ @logger ||= create_logger
+ end
+
+ def error(message)
+ log.error(message)
+ STDERR.puts(message)
+ end
+
+ def local_data_path(file)
+ # Return the path to 'file' relative to the currently running program.
+ # Used as a simple mechanism of finding local data files.
+ script_dir = File.dirname(__FILE__)
+ File.join(script_dir, file)
+ end
+
+ def write_lorry(file, lorry)
+ format_options = { :indent => ' ' }
+ file.puts(JSON.pretty_generate(lorry, format_options))
+ end
+
+ def write_morph(file, morph)
+ file.write(YAML.dump(morph))
+ end
+
+ def create_logger
+ # Use the logger that was passed in from the 'main' import process, if
+ # detected.
+ log_fd = ENV['MORPH_LOG_FD']
+ if log_fd
+ log_stream = IO.new(Integer(log_fd), 'w')
+ logger = Logger.new(log_stream)
+ logger.level = Logger::DEBUG
+ logger.formatter = proc { |severity, datetime, progname, msg| "#{msg}\n" }
+ else
+ logger = Logger.new('/dev/null')
+ end
+ logger
+ end
+ end
+end
diff --git a/main.py b/main.py
new file mode 100644
index 0000000..b5ebece
--- /dev/null
+++ b/main.py
@@ -0,0 +1,920 @@
+#!/usr/bin/python
+# Import foreign packaging systems into Baserock
+#
+# Copyright (C) 2014 Codethink Limited
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; version 2 of the License.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License along
+# with this program; if not, write to the Free Software Foundation, Inc.,
+# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+
+
+import ansicolor
+import cliapp
+import morphlib
+import networkx
+import six
+
+import contextlib
+import copy
+import json
+import logging
+import os
+import pipes
+import sys
+import tempfile
+import time
+
+from logging import debug
+
+
+class LorrySet(object):
+ '''Manages a set of .lorry files.
+
+ The structure of .lorry files makes the code a little more confusing than
+ I would like. A lorry "entry" is a dict of one entry mapping name to info.
+ A lorry "file" is a dict of one or more of these entries merged together.
+ If it were a list of entries with 'name' fields, the code would be neater.
+
+ '''
+ def __init__(self, lorries_path):
+ self.path = lorries_path
+
+ if os.path.exists(lorries_path):
+ self.data = self.parse_all_lorries()
+ else:
+ os.makedirs(lorries_path)
+ self.data = {}
+
+ def all_lorry_files(self):
+ for dirpath, dirnames, filenames in os.walk(self.path):
+ for filename in filenames:
+ if filename.endswith('.lorry'):
+ yield os.path.join(dirpath, filename)
+
+ def parse_all_lorries(self):
+ lorry_set = {}
+ for lorry_file in self.all_lorry_files():
+ lorry = self.parse_lorry(lorry_file)
+
+ lorry_items = lorry.items()
+
+ for key, value in lorry_items:
+ if key in lorry_set:
+ raise Exception(
+ '%s: duplicates existing lorry %s' % (lorry_file, key))
+
+ lorry_set.update(lorry_items)
+
+ return lorry_set
+
+ def parse_lorry(self, lorry_file):
+ try:
+ with open(lorry_file, 'r') as f:
+ lorry = json.load(f)
+ return lorry
+ except ValueError as e:
+ raise cliapp.AppException(
+ "Error parsing %s: %s" % (lorry_file, e))
+
+ def get_lorry(self, name):
+ return {name: self.data[name]}
+
+ def find_lorry_for_package(self, kind, package_name):
+ key = 'x-products-%s' % kind
+ for name, lorry in self.data.iteritems():
+ products = lorry.get(key, [])
+ for entry in products:
+ if entry == package_name:
+ return {name: lorry}
+
+ return None
+
+ def _check_for_conflicts_in_standard_fields(self, existing, new):
+ '''Ensure that two lorries for the same project do actually match.'''
+ for field, value in existing.iteritems():
+ if field.startswith('x-'):
+ continue
+ if field == 'url':
+ # FIXME: need a much better way of detecting whether the URLs
+ # are equivalent ... right now HTTP vs. HTTPS will cause an
+ # error, for example!
+ matches = (value.rstrip('/') == new[field].rstrip('/'))
+ else:
+ matches = (value == new[field])
+ if not matches:
+ raise Exception(
+ 'Lorry %s conflicts with existing entry %s at field %s' %
+ (new, existing, field))
+
+ def _merge_products_fields(self, existing, new):
+ '''Merge the x-products- fields from new lorry into an existing one.'''
+ is_product_field = lambda x: x.startswith('x-products-')
+
+ existing_fields = [f for f in existing.iterkeys() if
+ is_product_field(f)]
+ new_fields = [f for f in new.iterkeys() if f not in existing_fields and
+ is_product_field(f)]
+
+ for field in existing_fields:
+ existing[field].extend(new[field])
+ existing[field] = list(set(existing[field]))
+
+ for field in new_fields:
+ existing[field] = new[field]
+
+ def add(self, filename, lorry_entry):
+ logging.debug('Adding %s to lorryset', filename)
+
+ filename = os.path.join(self.path, '%s.lorry' % filename)
+
+ assert len(lorry_entry) == 1
+
+ project_name = lorry_entry.keys()[0]
+ info = lorry_entry.values()[0]
+
+ if len(project_name) == 0:
+ raise cliapp.AppException(
+ 'Invalid lorry %s: %s' % (filename, lorry_entry))
+
+ if not isinstance(info.get('url'), six.string_types):
+ raise cliapp.AppException(
+ 'Invalid URL in lorry %s: %s' % (filename, info.get('url')))
+
+ if project_name in self.data:
+ stored_lorry = self.get_lorry(project_name)
+
+ self._check_for_conflicts_in_standard_fields(
+ stored_lorry[project_name], lorry_entry[project_name])
+ self._merge_products_fields(
+ stored_lorry[project_name], lorry_entry[project_name])
+ lorry_entry = stored_lorry
+ else:
+ self.data[project_name] = info
+
+ self._add_lorry_entry_to_lorry_file(filename, lorry_entry)
+
+ def _add_lorry_entry_to_lorry_file(self, filename, entry):
+ if os.path.exists(filename):
+ with open(filename) as f:
+ contents = json.load(f)
+ else:
+ contents = {}
+
+ contents.update(entry)
+
+ with morphlib.savefile.SaveFile(filename, 'w') as f:
+ json.dump(contents, f, indent=4, separators=(',', ': '),
+ sort_keys=True)
+
+
+class MorphologySet(morphlib.morphset.MorphologySet):
+ def __init__(self, path):
+ super(MorphologySet, self).__init__()
+
+ self.path = path
+ self.loader = morphlib.morphloader.MorphologyLoader()
+
+ if os.path.exists(path):
+ self.load_all_morphologies()
+ else:
+ os.makedirs(path)
+
+ def load_all_morphologies(self):
+ logging.info('Loading all .morph files under %s', self.path)
+
+ class FakeGitDir(morphlib.gitdir.GitDirectory):
+ '''Ugh
+
+ This is here because the default constructor will search up the
+        directory hierarchy until it finds a '.git' directory, but that
+ may be totally the wrong place for our purpose: we don't have a
+ Git directory at all.
+
+ '''
+ def __init__(self, path):
+ self.dirname = path
+ self._config = {}
+
+ gitdir = FakeGitDir(self.path)
+ finder = morphlib.morphologyfinder.MorphologyFinder(gitdir)
+ for filename in (f for f in finder.list_morphologies()
+ if not gitdir.is_symlink(f)):
+ text = finder.read_morphology(filename)
+ morph = self.loader.load_from_string(text, filename=filename)
+ morph.repo_url = None # self.root_repository_url
+ morph.ref = None # self.system_branch_name
+ self.add_morphology(morph)
+
+ def get_morphology(self, repo_url, ref, filename):
+ return self._get_morphology(repo_url, ref, filename)
+
+ def save_morphology(self, filename, morphology):
+ self.add_morphology(morphology)
+ morphology_to_save = copy.copy(morphology)
+ self.loader.unset_defaults(morphology_to_save)
+ filename = os.path.join(self.path, filename)
+ self.loader.save_to_file(filename, morphology_to_save)
+
+
+class GitDirectory(morphlib.gitdir.GitDirectory):
+ def __init__(self, dirname):
+ super(GitDirectory, self).__init__(dirname)
+
+ # Work around strange/unintentional behaviour in GitDirectory class
+ # when 'repopath' isn't a Git repo. If 'repopath' is contained
+ # within a Git repo then the GitDirectory will traverse up to the
+ # parent repo, which isn't what we want in this case.
+ if self.dirname != dirname:
+ logging.error(
+ 'Got git directory %s for %s!', self.dirname, dirname)
+ raise cliapp.AppException(
+ '%s is not the root of a Git repository' % dirname)
+
+ def has_ref(self, ref):
+ try:
+ self._rev_parse(ref)
+ return True
+ except morphlib.gitdir.InvalidRefError:
+ return False
+
+
+class BaserockImportException(cliapp.AppException):
+ pass
+
+
+class Package(object):
+ '''A package in the processing queue.
+
+ In order to provide helpful errors, this item keeps track of what
+ packages depend on it, and hence of why it was added to the queue.
+
+ '''
+ def __init__(self, kind, name, version):
+ self.kind = kind
+ self.name = name
+ self.version = version
+ self.required_by = []
+ self.morphology = None
+ self.is_build_dep = False
+ self.version_in_use = version
+
+ def __cmp__(self, other):
+ return cmp(self.name, other.name)
+
+ def __repr__(self):
+ return '<Package %s-%s>' % (self.name, self.version)
+
+ def __str__(self):
+ if len(self.required_by) > 0:
+ required_msg = ', '.join(self.required_by)
+ required_msg = ', required by: ' + required_msg
+ else:
+ required_msg = ''
+ return '%s-%s%s' % (self.name, self.version, required_msg)
+
+ def add_required_by(self, item):
+ self.required_by.append('%s-%s' % (item.name, item.version))
+
+ def match(self, name, version):
+ return (self.name==name and self.version==version)
+
+ def set_morphology(self, morphology):
+ self.morphology = morphology
+
+ def set_is_build_dep(self, is_build_dep):
+ self.is_build_dep = is_build_dep
+
+ def set_version_in_use(self, version_in_use):
+ self.version_in_use = version_in_use
+
+
+def find(iterable, match):
+ return next((x for x in iterable if match(x)), None)
+
+
+def run_extension(filename, args, cwd='.'):
+ output = []
+ errors = []
+
+ ext_logger = logging.getLogger(filename)
+
+ def report_extension_stdout(line):
+ output.append(line)
+
+ def report_extension_stderr(line):
+ errors.append(line)
+
+ def report_extension_logger(line):
+ ext_logger.debug(line)
+
+ ext = morphlib.extensions.ExtensionSubprocess(
+ report_stdout=report_extension_stdout,
+ report_stderr=report_extension_stderr,
+ report_logger=report_extension_logger,
+ )
+
+ # There are better ways of doing this, but it works for now.
+ main_path = os.path.dirname(os.path.realpath(__file__))
+ extension_path = os.path.join(main_path, filename)
+
+ logging.debug("Running %s %s with cwd %s" % (extension_path, args, cwd))
+ returncode = ext.run(extension_path, args, cwd, os.environ)
+
+ if returncode == 0:
+ ext_logger.info('succeeded')
+ else:
+ for line in errors:
+ ext_logger.error(line)
+ message = '%s failed with code %s: %s' % (
+ filename, returncode, '\n'.join(errors))
+ raise BaserockImportException(message)
+
+ return '\n'.join(output)
+
+
+class ImportLoop(object):
+ '''Import a package and all of its dependencies into Baserock.
+
+ This class holds the state for the processing loop.
+
+ '''
+
+ def __init__(self, app, goal_kind, goal_name, goal_version, extra_args=[]):
+ self.app = app
+ self.goal_kind = goal_kind
+ self.goal_name = goal_name
+ self.goal_version = goal_version
+ self.extra_args = extra_args
+
+ self.lorry_set = LorrySet(self.app.settings['lorries-dir'])
+ self.morph_set = MorphologySet(self.app.settings['definitions-dir'])
+
+ self.morphloader = morphlib.morphloader.MorphologyLoader()
+
+ self.importers = {}
+
+ def enable_importer(self, kind, extra_args=[]):
+ assert kind not in self.importers
+ self.importers[kind] = {
+ 'extra_args': extra_args
+ }
+
+ def run(self):
+ '''Process the goal package and all of its dependencies.'''
+ start_time = time.time()
+ start_displaytime = time.strftime('%x %X %Z', time.localtime())
+
+ self.app.status(
+ '%s: Import of %s %s started', start_displaytime, self.goal_kind,
+ self.goal_name)
+
+ if not self.app.settings['update-existing']:
+ self.app.status(
+ 'Not updating existing Git checkouts or existing definitions')
+
+ chunk_dir = os.path.join(self.morph_set.path, 'strata', self.goal_name)
+ if not os.path.exists(chunk_dir):
+ os.makedirs(chunk_dir)
+
+ goal = Package(self.goal_kind, self.goal_name, self.goal_version)
+ to_process = [goal]
+ processed = networkx.DiGraph()
+
+ errors = {}
+
+ while len(to_process) > 0:
+ current_item = to_process.pop()
+
+ try:
+ self._process_package(current_item)
+ error = False
+ except BaserockImportException as e:
+ self.app.status(str(e), error=True)
+ errors[current_item] = e
+ error = True
+
+ processed.add_node(current_item)
+
+ if not error:
+ self._process_dependencies_from_morphology(
+ current_item, current_item.morphology, to_process,
+ processed)
+
+ if len(errors) > 0:
+ self.app.status(
+ '\nErrors encountered, not generating a stratum morphology.')
+ self.app.status(
+ 'See the README files for guidance.')
+ else:
+ self._generate_stratum_morph_if_none_exists(
+ processed, self.goal_name)
+
+ duration = time.time() - start_time
+ end_displaytime = time.strftime('%x %X %Z', time.localtime())
+
+ self.app.status(
+ '%s: Import of %s %s ended (took %i seconds)', end_displaytime,
+ self.goal_kind, self.goal_name, duration)
+
+ def _process_package(self, package):
+ kind = package.kind
+ name = package.name
+ version = package.version
+
+ lorry = self._find_or_create_lorry_file(kind, name)
+ source_repo, url = self._fetch_or_update_source(lorry)
+
+ checked_out_version, ref = self._checkout_source_version(
+ source_repo, name, version)
+ package.set_version_in_use(checked_out_version)
+
+ chunk_morph = self._find_or_create_chunk_morph(
+ kind, name, checked_out_version, source_repo, url, ref)
+
+ if self.app.settings['use-local-sources']:
+ chunk_morph.repo_url = 'file://' + source_repo.dirname
+ else:
+ reponame = lorry.keys()[0]
+ chunk_morph.repo_url = 'upstream:%s' % reponame
+
+ package.set_morphology(chunk_morph)
+
+ def _process_dependencies_from_morphology(self, current_item, morphology,
+ to_process, processed):
+ '''Enqueue all dependencies of a package that are yet to be processed.
+
+ Dependencies are communicated using extra fields in morphologies,
+ currently.
+
+ '''
+ for key, value in morphology.iteritems():
+ if key.startswith('x-build-dependencies-'):
+ kind = key[len('x-build-dependencies-'):]
+ is_build_deps = True
+ elif key.startswith('x-runtime-dependencies-'):
+ kind = key[len('x-runtime-dependencies-'):]
+ is_build_deps = False
+ else:
+ continue
+
+ # We need to validate this field because it doesn't go through the
+ # normal MorphologyFactory validation, being an extension.
+ if not hasattr(value, 'iteritems'):
+ value_type = type(value).__name__
+ raise cliapp.AppException(
+ "Morphology for %s has invalid '%s': should be a dict, but "
+ "got a %s." % (morphology['name'], key, value_type))
+
+ self._process_dependency_list(
+ current_item, kind, value, to_process, processed, is_build_deps)
+
+ def _process_dependency_list(self, current_item, kind, deps, to_process,
+ processed, these_are_build_deps):
+ # All deps are added as nodes to the 'processed' graph. Runtime
+ # dependencies only need to appear in the stratum, but build
+ # dependencies have ordering constraints, so we add edges in
+ # the graph for build-dependencies too.
+
+ for dep_name, dep_version in deps.iteritems():
+ dep_package = find(
+ processed, lambda i: i.match(dep_name, dep_version))
+
+ if dep_package is None:
+ # Not yet processed
+ queue_item = find(
+ to_process, lambda i: i.match(dep_name, dep_version))
+ if queue_item is None:
+ queue_item = Package(kind, dep_name, dep_version)
+ to_process.append(queue_item)
+ dep_package = queue_item
+
+ dep_package.add_required_by(current_item)
+
+ if these_are_build_deps or current_item.is_build_dep:
+ # A runtime dep of a build dep becomes a build dep
+ # itself.
+ dep_package.set_is_build_dep(True)
+ processed.add_edge(dep_package, current_item)
+
+ def _find_or_create_lorry_file(self, kind, name):
+ # Note that the lorry file may already exist for 'name', but lorry
+ # files are named for project name rather than package name. In this
+ # case we will generate the lorry, and try to add it to the set, at
+ # which point LorrySet will notice the existing one and merge the two.
+ lorry = self.lorry_set.find_lorry_for_package(kind, name)
+
+ if lorry is None:
+ lorry = self._generate_lorry_for_package(kind, name)
+
+ if len(lorry) != 1:
+ raise Exception(
+ 'Expected generated lorry file with one entry.')
+
+ lorry_filename = lorry.keys()[0]
+
+ if '/' in lorry_filename:
+ # We try to be a bit clever and guess that if there's a prefix
+ # in the name, e.g. 'ruby-gems/chef' then it should go in a
+ # mega-lorry file, such as ruby-gems.lorry.
+ parts = lorry_filename.split('/', 1)
+ lorry_filename = parts[0]
+
+ if lorry_filename == '':
+ raise cliapp.AppException(
+ 'Invalid lorry data for %s: %s' % (name, lorry))
+
+ self.lorry_set.add(lorry_filename, lorry)
+ else:
+ lorry_filename = lorry.keys()[0]
+ logging.info(
+ 'Found existing lorry file for %s: %s', name, lorry_filename)
+
+ return lorry
+
+ def _generate_lorry_for_package(self, kind, name):
+ tool = '%s.to_lorry' % kind
+ if kind not in self.importers:
+ raise Exception('Importer for %s was not enabled.' % kind)
+ extra_args = self.importers[kind]['extra_args']
+ self.app.status('Calling %s to generate lorry for %s', tool, name)
+ lorry_text = run_extension(tool, extra_args + [name])
+ try:
+ lorry = json.loads(lorry_text)
+ except ValueError as e:
+ raise cliapp.AppException(
+ 'Invalid output from %s: %s' % (tool, lorry_text))
+ return lorry
+
+ def _run_lorry(self, lorry):
+ f = tempfile.NamedTemporaryFile(delete=False)
+ try:
+ logging.debug(json.dumps(lorry))
+ json.dump(lorry, f)
+ f.close()
+ cliapp.runcmd([
+ 'lorry', '--working-area',
+ self.app.settings['lorry-working-dir'], '--pull-only',
+ '--bundle', 'never', '--tarball', 'never', f.name])
+ finally:
+ os.unlink(f.name)
+
+ def _fetch_or_update_source(self, lorry):
+ assert len(lorry) == 1
+ lorry_name, lorry_entry = lorry.items()[0]
+
+ url = lorry_entry['url']
+ reponame = '_'.join(lorry_name.split('/'))
+ repopath = os.path.join(
+ self.app.settings['lorry-working-dir'], reponame, 'git')
+
+ checkoutpath = os.path.join(
+ self.app.settings['checkouts-dir'], reponame)
+
+ try:
+ already_lorried = os.path.exists(repopath)
+ if already_lorried:
+ if self.app.settings['update-existing']:
+ self.app.status('Updating lorry of %s', url)
+ self._run_lorry(lorry)
+ else:
+ self.app.status('Lorrying %s', url)
+ self._run_lorry(lorry)
+
+ if os.path.exists(checkoutpath):
+ repo = GitDirectory(checkoutpath)
+ repo.update_remotes()
+ else:
+ if already_lorried:
+ logging.warning(
+ 'Expected %s to exist, but will recreate it',
+ checkoutpath)
+ cliapp.runcmd(['git', 'clone', repopath, checkoutpath])
+ repo = GitDirectory(checkoutpath)
+ except cliapp.AppException as e:
+ raise BaserockImportException(e.msg.rstrip())
+
+ return repo, url
+
+ def _checkout_source_version(self, source_repo, name, version):
+ # FIXME: we need to be a bit smarter than this. Right now we assume
+ # that 'version' is a valid Git ref.
+
+ possible_names = [
+ version,
+ 'v%s' % version,
+ '%s-%s' % (name, version)
+ ]
+
+ for tag_name in possible_names:
+ if source_repo.has_ref(tag_name):
+ source_repo.checkout(tag_name)
+ ref = tag_name
+ break
+ else:
+ if self.app.settings['use-master-if-no-tag']:
+ logging.warning(
+                    "Couldn't find a tag for version %s in repo %s. "
+                    "Using 'master'.", version, source_repo)
+ source_repo.checkout('master')
+ ref = version = 'master'
+ else:
+ raise BaserockImportException(
+ 'Could not find ref for %s version %s.' % (name, version))
+
+ return version, ref
+
+ def _find_or_create_chunk_morph(self, kind, name, version, source_repo,
+ repo_url, named_ref):
+ morphology_filename = 'strata/%s/%s-%s.morph' % (
+ self.goal_name, name, version)
+ sha1 = source_repo.resolve_ref_to_commit(named_ref)
+
+ def generate_morphology():
+ morphology = self._generate_chunk_morph_for_package(
+ source_repo, kind, name, version, morphology_filename)
+ self.morph_set.save_morphology(morphology_filename, morphology)
+ return morphology
+
+ if self.app.settings['update-existing']:
+ morphology = generate_morphology()
+ else:
+ morphology = self.morph_set.get_morphology(
+ repo_url, sha1, morphology_filename)
+
+ if morphology is None:
+ # Existing chunk morphologies loaded from disk don't contain
+ # the repo and ref information. That's stored in the stratum
+ # morph. So the first time we touch a chunk morph we need to
+ # set this info.
+ logging.debug("Didn't find morphology for %s|%s|%s", repo_url,
+ sha1, morphology_filename)
+ morphology = self.morph_set.get_morphology(
+ None, None, morphology_filename)
+
+ if morphology is None:
+ logging.debug("Didn't find morphology for None|None|%s",
+ morphology_filename)
+ morphology = generate_morphology()
+
+ morphology.repo_url = repo_url
+ morphology.ref = sha1
+ morphology.named_ref = named_ref
+
+ return morphology
+
+ def _generate_chunk_morph_for_package(self, source_repo, kind, name,
+ version, filename):
+ tool = '%s.to_chunk' % kind
+
+ if kind not in self.importers:
+ raise Exception('Importer for %s was not enabled.' % kind)
+ extra_args = self.importers[kind]['extra_args']
+
+ self.app.status(
+ 'Calling %s to generate chunk morph for %s %s', tool, name,
+ version)
+
+ args = extra_args + [source_repo.dirname, name]
+ if version != 'master':
+ args.append(version)
+ text = run_extension(tool, args)
+
+ return self.morphloader.load_from_string(text, filename)
+
+ def _sort_chunks_by_build_order(self, graph):
+ order = reversed(sorted(graph.nodes()))
+ try:
+ return networkx.topological_sort(graph, nbunch=order)
+ except networkx.NetworkXUnfeasible as e:
+ # Cycle detected!
+ loop_subgraphs = networkx.strongly_connected_component_subgraphs(
+ graph, copy=False)
+ all_loops_str = []
+            for subgraph in loop_subgraphs:
+                if subgraph.number_of_nodes() > 1:
+                    loops_str = '->'.join(
+                        str(node) for node in subgraph.nodes())
+                    all_loops_str.append(loops_str)
+ raise cliapp.AppException(
+ 'One or more cycles detected in build graph: %s' %
+ (', '.join(all_loops_str)))
+
+ def _generate_stratum_morph_if_none_exists(self, graph, goal_name):
+ filename = os.path.join(
+ self.app.settings['definitions-dir'], 'strata', '%s.morph' %
+ goal_name)
+
+ if os.path.exists(filename) and not self.app.settings['update-existing']:
+ self.app.status(
+ msg='Found stratum morph for %s at %s, not overwriting' %
+ (goal_name, filename))
+ return
+
+ self.app.status(msg='Generating stratum morph for %s' % goal_name)
+
+ chunk_entries = []
+
+ for package in self._sort_chunks_by_build_order(graph):
+ m = package.morphology
+ if m is None:
+ raise cliapp.AppException('No morphology for %s' % package)
+
+ def format_build_dep(name, version):
+ dep_package = find(graph, lambda p: p.match(name, version))
+ return '%s-%s' % (name, dep_package.version_in_use)
+
+ build_depends = [
+ format_build_dep(name, version) for name, version in
+ m['x-build-dependencies-rubygems'].iteritems()
+ ]
+
+ entry = {
+ 'name': m['name'],
+ 'repo': m.repo_url,
+ 'ref': m.ref,
+ 'unpetrify-ref': m.named_ref,
+ 'morph': m.filename,
+ 'build-depends': build_depends,
+ }
+ chunk_entries.append(entry)
+
+ stratum_name = goal_name
+ stratum = {
+ 'name': stratum_name,
+ 'kind': 'stratum',
+ 'description': 'Autogenerated by Baserock import tool',
+ 'build-depends': [
+ {'morph': 'strata/ruby.morph'}
+ ],
+ 'chunks': chunk_entries,
+ }
+
+ morphology = self.morphloader.load_from_string(
+ json.dumps(stratum), filename=filename)
+ self.morphloader.unset_defaults(morphology)
+ self.morphloader.save_to_file(filename, morphology)
+
+
+class BaserockImportApplication(cliapp.Application):
+ def add_settings(self):
+ self.settings.string(['lorries-dir'],
+ "location for Lorry files",
+ metavar="PATH",
+ default=os.path.abspath('./lorries'))
+ self.settings.string(['definitions-dir'],
+ "location for morphology files",
+ metavar="PATH",
+ default=os.path.abspath('./definitions'))
+ self.settings.string(['checkouts-dir'],
+ "location for Git checkouts",
+ metavar="PATH",
+ default=os.path.abspath('./checkouts'))
+ self.settings.string(['lorry-working-dir'],
+ "Lorry working directory",
+ metavar="PATH",
+ default=os.path.abspath('./lorry-working-dir'))
+
+ self.settings.boolean(['update-existing'],
+ "update all the checked-out Git trees and "
+ "generated definitions",
+ default=False)
+ self.settings.boolean(['use-local-sources'],
+ "use file:/// URLs in the stratum 'repo' "
+ "fields, instead of upstream: URLs",
+ default=False)
+ self.settings.boolean(['use-master-if-no-tag'],
+ "if the correct tag for a version can't be "
+ "found, use 'master' instead of raising an "
+ "error",
+ default=False)
+
+ def _stream_has_colours(self, stream):
+ # http://blog.mathieu-leplatre.info/colored-output-in-console-with-python.html
+ if not hasattr(stream, "isatty"):
+ return False
+ if not stream.isatty():
+ return False # auto color only on TTYs
+ try:
+ import curses
+ curses.setupterm()
+ return curses.tigetnum("colors") > 2
+        except Exception:
+            # guess false in case of error
+            return False
+
+ def setup(self):
+ self.add_subcommand('omnibus', self.import_omnibus,
+ arg_synopsis='REPO PROJECT_NAME SOFTWARE_NAME')
+ self.add_subcommand('rubygems', self.import_rubygems,
+ arg_synopsis='GEM_NAME')
+
+ self.stdout_has_colours = self._stream_has_colours(sys.stdout)
+
+ def setup_logging_formatter_for_file(self):
+ root_logger = logging.getLogger()
+ root_logger.name = 'main'
+
+ # You need recent cliapp for this to work, with commit "Split logging
+ # setup into further overrideable methods".
+ return logging.Formatter("%(name)s: %(levelname)s: %(message)s")
+
+ def process_args(self, args):
+ if len(args) == 0:
+            # Cliapp's default is to just say "ERROR: must give subcommand"
+            # when no args are passed; showing the help instead is friendlier.
+            args = ['help']
+
+ super(BaserockImportApplication, self).process_args(args)
+
+ def status(self, msg, *args, **kwargs):
+ text = msg % args
+        if kwargs.get('error'):
+ logging.error(text)
+ if self.stdout_has_colours:
+ sys.stdout.write(ansicolor.red(text))
+ else:
+ sys.stdout.write(text)
+ else:
+ logging.info(text)
+ sys.stdout.write(text)
+ sys.stdout.write('\n')
+
+ def import_omnibus(self, args):
+ '''Import a software component from an Omnibus project.
+
+ Omnibus is a tool for generating application bundles for various
+ platforms. See <https://github.com/opscode/omnibus> for more
+ information.
+
+ '''
+ if len(args) != 3:
+ raise cliapp.AppException(
+ 'Please give the location of the Omnibus definitions repo, '
+ 'and the name of the project and the top-level software '
+ 'component.')
+
+ def running_inside_bundler():
+ return 'BUNDLE_GEMFILE' in os.environ
+
+ def command_to_run_python_in_directory(directory, args):
+ # Bundler requires that we run it from the Omnibus project
+ # directory. That messes up any relative paths the user may have
+ # passed on the commandline, so we do a bit of a hack to change
+ # back to the original directory inside the `bundle exec` process.
+ subshell_command = "(cd %s; exec python %s)" % \
+ (pipes.quote(directory), ' '.join(map(pipes.quote, args)))
+ shell_command = "sh -c %s" % pipes.quote(subshell_command)
+ return shell_command
+
+ def reexecute_self_with_bundler(path):
+ script = sys.argv[0]
+
+ logging.info('Reexecuting %s within Bundler, so that extensions '
+ 'use the correct dependencies for Omnibus and the '
+ 'Omnibus project definitions.', script)
+ command = command_to_run_python_in_directory(os.getcwd(), sys.argv)
+
+ logging.debug('Running: `bundle exec %s` in dir %s', command, path)
+ os.chdir(path)
+            os.execvp('bundle', ['bundle', 'exec', command])
+
+ # Omnibus definitions are spread across multiple repos, and there is
+ # no stability guarantee for the definition format. The official advice
+ # is to use Bundler to execute Omnibus, so let's do that.
+ if not running_inside_bundler():
+ reexecute_self_with_bundler(args[0])
+
+ definitions_dir = args[0]
+ project_name = args[1]
+
+ loop = ImportLoop(
+ app=self,
+ goal_kind='omnibus', goal_name=args[2], goal_version='master')
+ loop.enable_importer('omnibus',
+ extra_args=[definitions_dir, project_name])
+ loop.enable_importer('rubygems')
+ loop.run()
+
+ def import_rubygems(self, args):
+ '''Import one or more RubyGems.'''
+ if len(args) != 1:
+ raise cliapp.AppException(
+ 'Please pass the name of a RubyGem on the commandline.')
+
+ loop = ImportLoop(
+ app=self,
+ goal_kind='rubygems', goal_name=args[0], goal_version='master')
+ loop.enable_importer('rubygems')
+ loop.run()
+
+
+app = BaserockImportApplication(progname='import')
+app.run()
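
The version-to-ref heuristic used by `_checkout_source_version` above can be sketched in isolation. This is a minimal sketch in plain Python: `candidate_refs` mirrors the candidate list from the code, while `find_ref` and its `repo_refs` container are stand-ins for the `source_repo.has_ref()` interface, not part of the tool itself:

```python
def candidate_refs(name, version):
    """Ref names tried, in order, when checking out a package version.

    Mirrors _checkout_source_version: the bare version, a 'v'-prefixed
    tag, then a '<name>-<version>' tag.
    """
    return [version, 'v%s' % version, '%s-%s' % (name, version)]


def find_ref(repo_refs, name, version, use_master_if_no_tag=False):
    # repo_refs is a stand-in for source_repo.has_ref(): any container
    # supporting 'in' works here.
    for ref in candidate_refs(name, version):
        if ref in repo_refs:
            return ref
    if use_master_if_no_tag:
        # Same fallback as the --use-master-if-no-tag setting.
        return 'master'
    raise ValueError('Could not find ref for %s version %s.' % (name, version))
```

For example, a repo that tags releases as `v1.2.0` is matched by the second candidate, and an untagged repo falls back to `master` only when explicitly allowed.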
diff --git a/omnibus.to_chunk b/omnibus.to_chunk
new file mode 100755
index 0000000..1189199
--- /dev/null
+++ b/omnibus.to_chunk
@@ -0,0 +1,274 @@
+#!/usr/bin/env ruby
+#
+# Create a chunk morphology to integrate Omnibus software in Baserock
+#
+# Copyright (C) 2014 Codethink Limited
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; version 2 of the License.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License along
+# with this program; if not, write to the Free Software Foundation, Inc.,
+# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+
+require 'bundler'
+require 'omnibus'
+
+require 'optparse'
+require 'rubygems/commands/build_command'
+require 'rubygems/commands/install_command'
+require 'shellwords'
+
+require_relative 'importer_base'
+
+BANNER = "Usage: omnibus.to_chunk PROJECT_DIR PROJECT_NAME SOURCE_DIR SOFTWARE_NAME"
+
+DESCRIPTION = <<-END
+Generate a .morph file for a given Omnibus software component.
+END
+
+class Omnibus::Builder
+ # It's possible to use `gem install` in build commands, which is a great
+ # way of subverting the dependency tracking Omnibus provides. It's done
+ # in `omnibus-chef/config/software/chefdk.rb`, for example.
+ #
+ # To handle this, here we extend the class that executes the build commands
+ # to detect when `gem install` is run. It uses the Gem library to turn the
+ # commandline back into a Bundler::Dependency object that we can use.
+ #
+ # We also trap `gem build` so we know when a software component is a RubyGem
+ # that should be handled by 'rubygems.to_chunk'.
+
+ class GemBuildCommandParser < Gem::Commands::BuildCommand
+ def gemspec_path(args)
+ handle_options args
+ if options[:args].length != 1
+ raise Exception, "Invalid `gem build` commandline: 1 argument " +
+ "expected, got #{options[:args]}."
+ end
+ options[:args][0]
+ end
+ end
+
+ class GemInstallCommandParser < Gem::Commands::InstallCommand
+ def dependency_list_from_commandline(args)
+ handle_options args
+
+ # `gem install foo*` is sometimes used when installing a locally built
+ # Gem, to avoid needing to know the exact version number that was built.
+ # We only care about remote Gems being installed, so anything with a '*'
+ # in its name can be ignored.
+ gem_names = options[:args].delete_if { |name| name.include?('*') }
+
+ gem_names.collect do |gem_name|
+ Bundler::Dependency.new(gem_name, options[:version])
+ end
+ end
+ end
+
+ def gem(command, options = {})
+ # This function re-implements the 'gem' function in the build-commands DSL.
+ if command.start_with? 'build'
+ parser = GemBuildCommandParser.new
+ args = Shellwords.split(command).drop(1)
+ if built_gemspec != nil
+ raise Exception, "More than one `gem build` command was run as part " +
+ "of the build process. The 'rubygems.to_chunk' " +
+ "program currently supports only one .gemspec " +
+ "build per chunk, so this can't be processed " +
+ "automatically."
+ end
+ @built_gemspec = parser.gemspec_path(args)
+ elsif command.start_with? 'install'
+ parser = GemInstallCommandParser.new
+ args = Shellwords.split(command).drop(1)
+ args_without_build_flags = args.take_while { |item| item != '--' }
+ gems = parser.dependency_list_from_commandline(args_without_build_flags)
+ manually_installed_rubygems.concat gems
+ end
+ end
+
+ def built_gemspec
+ @built_gemspec
+ end
+
+ def manually_installed_rubygems
+ @manually_installed_rubygems ||= []
+ end
+end
+
+class OmnibusChunkMorphologyGenerator < Importer::Base
+ def initialize
+ local_data = YAML.load_file(local_data_path("omnibus.yaml"))
+ @dependency_blacklist = local_data['dependency-blacklist']
+ end
+
+ def parse_options(arguments)
+ opts = create_option_parser(BANNER, DESCRIPTION)
+
+ parsed_arguments = opts.parse!(arguments)
+
+ if parsed_arguments.length != 4 and parsed_arguments.length != 5
+ STDERR.puts "Expected 4 or 5 arguments, got #{parsed_arguments}."
+ opts.parse(['-?'])
+ exit 255
+ end
+
+ project_dir, project_name, source_dir, software_name, expected_version = \
+ parsed_arguments
+ # Not yet implemented
+ #if expected_version != nil
+ # expected_version = Gem::Version.new(expected_version)
+ #end
+ [project_dir, project_name, source_dir, software_name, expected_version]
+ end
+
+ class SubprocessError < RuntimeError
+ end
+
+ def run_tool_capture_output(tool_name, *args)
+ tool_path = local_data_path(tool_name)
+
+    # FIXME: something breaks when we try to share this FD. It's not
+    # ideal that the subprocess doesn't log anything, though.
+    env_changes = {'MORPH_LOG_FD' => nil}
+
+ command = [[tool_path, tool_name], *args]
+ log.info("Running #{command.join(' ')} in #{scripts_dir}")
+
+ text = IO.popen(
+ env_changes, command, :chdir => scripts_dir, :err => [:child, :out]
+ ) do |io|
+ io.read
+ end
+
+    if $?.success?
+ text
+ else
+ raise SubprocessError, text
+ end
+ end
+
+ def generate_chunk_morph_for_rubygems_software(software, source_dir)
+    # The software's relative path seems to be a more reliable heuristic
+    # for the Gem's name than the software name itself.
+ gem_name = software.relative_path
+
+ text = run_tool_capture_output('rubygems.to_chunk', source_dir, gem_name)
+ log.debug("Text from output: #{text}, result #{$?}")
+
+ morphology = YAML::load(text)
+ return morphology
+ rescue SubprocessError => e
+ error "Tried to import #{software.name} as a RubyGem, got the " \
+ "following error from rubygems.to_chunk: #{e.message}"
+ exit 1
+ end
+
+ def resolve_rubygems_deps(requirements)
+ return {} if requirements.empty?
+
+ log.info('Resolving RubyGem requirements with Bundler')
+
+ fake_gemfile = Bundler::Dsl.new
+ fake_gemfile.source('https://rubygems.org')
+
+ requirements.each do |dep|
+ fake_gemfile.gem(dep.name, dep.requirement)
+ end
+
+ definition = fake_gemfile.to_definition('Gemfile.lock', true)
+ resolved_specs = definition.resolve_remotely!
+
+ Hash[resolved_specs.collect { |spec| [spec.name, spec.version.to_s]}]
+ end
+
+ def generate_chunk_morph_for_software(project, software, source_dir)
+ if software.builder.built_gemspec != nil
+ morphology = generate_chunk_morph_for_rubygems_software(software,
+ source_dir)
+ else
+ morphology = {
+ "name" => software.name,
+ "kind" => "chunk",
+ "description" => "Automatically generated by omnibus.to_chunk"
+ }
+ end
+
+ omnibus_deps = {}
+ rubygems_deps = {}
+
+    software.dependencies.each do |name|
+      dep_software = Omnibus::Software.load(project, name)
+      if @dependency_blacklist.member? name
+        log.info(
+          "Not adding #{name} as a dependency as it is marked to be ignored.")
+      elsif dep_software.fetcher.instance_of?(Omnibus::PathFetcher)
+        log.info(
+          "Not adding #{name} as a dependency: it's installed from " +
+          "a path, which probably means that it is package configuration, " +
+          "not a 3rd-party component to be imported.")
+      elsif dep_software.fetcher.instance_of?(Omnibus::NullFetcher)
+        if dep_software.builder.built_gemspec
+          log.info(
+            "Adding #{name} as a RubyGem dependency because it builds " +
+            "#{dep_software.builder.built_gemspec}")
+          rubygems_deps[name] = dep_software.version
+        else
+          log.info(
+            "Not adding #{name} as a dependency: no sources listed.")
+        end
+      else
+        omnibus_deps[name] = dep_software.version
+      end
+    end
+
+    gem_requirements = software.builder.manually_installed_rubygems
+    rubygems_deps.update(resolve_rubygems_deps(gem_requirements))
+
+ morphology.update({
+ # Possibly this tool should look at software.build and
+ # generate suitable configure, build and install-commands.
+ # For now: don't bother!
+
+ # FIXME: are these build or runtime dependencies? We'll assume both.
+ "x-build-dependencies-omnibus" => omnibus_deps,
+ "x-runtime-dependencies-omnibus" => omnibus_deps,
+
+ "x-build-dependencies-rubygems" => {},
+ "x-runtime-dependencies-rubygems" => rubygems_deps,
+ })
+
+ if software.description
+      morphology['description'] = software.description + "\n\n" +
+ morphology['description']
+ end
+
+ morphology
+ end
+
+ def run
+ project_dir, project_name, source_dir, software_name = parse_options(ARGV)
+
+ log.info("Creating chunk morph for #{software_name} from project " +
+ "#{project_name}, defined in #{project_dir}")
+
+ Dir.chdir(project_dir)
+
+ project = Omnibus::Project.load(project_name)
+
+    software = Omnibus::Software.load(project, software_name)
+
+ morph = generate_chunk_morph_for_software(project, software, source_dir)
+
+ write_morph(STDOUT, morph)
+ end
+end
+
+OmnibusChunkMorphologyGenerator.new.run
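
The dependency classification in `generate_chunk_morph_for_software` above boils down to a small decision table. A minimal sketch of the same branching in plain Python (fetcher kinds reduced to strings, and the function name is illustrative, not from the tool):

```python
def classify_dependency(name, fetcher_kind, builds_gemspec, blacklist):
    """Decide how an Omnibus software dependency should be recorded.

    Returns 'ignore', 'rubygem', or 'omnibus', mirroring the branches
    in generate_chunk_morph_for_software above.
    """
    if name in blacklist:
        return 'ignore'   # explicitly blacklisted (e.g. cacerts)
    if fetcher_kind == 'path':
        # Installed from a path: probably package configuration,
        # not a 3rd-party component to be imported.
        return 'ignore'
    if fetcher_kind == 'null':
        # No sources listed: only interesting if it builds a .gemspec,
        # in which case it is tracked as a RubyGem dependency.
        return 'rubygem' if builds_gemspec else 'ignore'
    return 'omnibus'      # a real fetched component (git, tarball, ...)
```

Everything fetched from a real upstream source ends up in the `x-*-dependencies-omnibus` fields, while `gem build`-producing components go through the RubyGems path instead.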
diff --git a/omnibus.to_lorry b/omnibus.to_lorry
new file mode 100755
index 0000000..256f924
--- /dev/null
+++ b/omnibus.to_lorry
@@ -0,0 +1,94 @@
+#!/usr/bin/env ruby
+#
+# Create a Baserock .lorry file for a given Omnibus software component
+#
+# Copyright (C) 2014 Codethink Limited
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; version 2 of the License.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License along
+# with this program; if not, write to the Free Software Foundation, Inc.,
+# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+
+require 'bundler'
+require 'omnibus'
+
+require 'optparse'
+require 'rubygems/commands/install_command'
+require 'shellwords'
+
+require_relative 'importer_base'
+
+BANNER = "Usage: omnibus.to_lorry PROJECT_DIR PROJECT_NAME SOFTWARE_NAME"
+
+DESCRIPTION = <<-END
+Generate a .lorry file for a given Omnibus software component.
+END
+
+class OmnibusLorryGenerator < Importer::Base
+ def parse_options(arguments)
+ opts = create_option_parser(BANNER, DESCRIPTION)
+
+ parsed_arguments = opts.parse!(arguments)
+
+ if parsed_arguments.length != 3
+ STDERR.puts "Expected 3 arguments, got #{parsed_arguments}."
+ opts.parse(['-?'])
+ exit 255
+ end
+
+ project_dir, project_name, software_name = parsed_arguments
+ [project_dir, project_name, software_name]
+ end
+
+ def generate_lorry_for_software(software)
+ lorry_body = {
+ 'x-products-omnibus' => [software.name]
+ }
+
+ if software.source and software.source.member? :git
+ lorry_body.update({
+ 'type' => 'git',
+ 'url' => software.source[:git],
+ })
+ elsif software.source and software.source.member? :url
+ lorry_body.update({
+ 'type' => 'tarball',
+ 'url' => software.source[:url],
+ # lorry doesn't validate the checksum right now, but maybe it should.
+ 'x-md5' => software.source[:md5],
+ })
+ else
+ error "Couldn't generate lorry file from source '#{software.source.inspect}'"
+ exit 1
+ end
+
+ { software.name => lorry_body }
+ end
+
+ def run
+ project_dir, project_name, software_name = parse_options(ARGV)
+
+ log.info("Creating lorry for #{software_name} from project " +
+ "#{project_name}, defined in #{project_dir}")
+
+ Dir.chdir(project_dir)
+
+ project = Omnibus::Project.load(project_name)
+
+ software = Omnibus::Software.load(project, software_name)
+
+ lorry = generate_lorry_for_software(software)
+
+ write_lorry(STDOUT, lorry)
+ end
+end
+
+OmnibusLorryGenerator.new.run
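
The .lorry entries emitted by `generate_lorry_for_software` above take one of two shapes depending on the source type. A minimal sketch of the same branching (field names taken from the code above; the input dict is a stand-in for `software.source`, and the function name is illustrative):

```python
def lorry_body_for_source(name, source):
    """Build a .lorry entry for an Omnibus software source.

    'source' is a dict with either a 'git' key or a 'url' (tarball)
    key, mirroring the Ruby generator above.
    """
    body = {'x-products-omnibus': [name]}
    if 'git' in source:
        body.update({'type': 'git', 'url': source['git']})
    elif 'url' in source:
        # Lorry doesn't validate the checksum right now, but record it anyway.
        body.update({'type': 'tarball',
                     'url': source['url'],
                     'x-md5': source.get('md5')})
    else:
        raise ValueError(
            "Couldn't generate lorry body from source %r" % (source,))
    return {name: body}
```

Git sources are preferred when both are available, since Lorry can mirror them directly without converting a tarball into a repository.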
diff --git a/omnibus.yaml b/omnibus.yaml
new file mode 100644
index 0000000..2116f2a
--- /dev/null
+++ b/omnibus.yaml
@@ -0,0 +1,7 @@
+---
+
+dependency-blacklist:
+ # This is provided as a single downloadable .pem file, which isn't something
+ # Lorry can understand. Also, it's provided by the 'ca-certificates' chunk in
+ # Baserock already.
+ - cacerts
diff --git a/rubygems.to_chunk b/rubygems.to_chunk
new file mode 100755
index 0000000..796fe89
--- /dev/null
+++ b/rubygems.to_chunk
@@ -0,0 +1,275 @@
+#!/usr/bin/env ruby
+#
+# Create a chunk morphology to integrate a RubyGem in Baserock
+#
+# Copyright (C) 2014 Codethink Limited
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; version 2 of the License.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License along
+# with this program; if not, write to the Free Software Foundation, Inc.,
+# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+
+require 'bundler'
+
+require_relative 'importer_base'
+
+class << Bundler
+ def default_gemfile
+ # This is a hack to make things not crash when there's no Gemfile
+ Pathname.new('.')
+ end
+end
+
+def spec_is_from_current_source_tree(spec, source_dir)
+ spec.source.instance_of? Bundler::Source::Path and
+ File.identical?(spec.source.path, source_dir)
+end
+
+BANNER = "Usage: rubygems.to_chunk SOURCE_DIR GEM_NAME [VERSION]"
+
+DESCRIPTION = <<-END
+This tool reads the Gemfile and optionally the Gemfile.lock from a Ruby project
+source tree in SOURCE_DIR. It outputs a chunk morphology for GEM_NAME on
+stdout. If VERSION is supplied, it is used to check that the build instructions
+will produce the expected version of the Gem.
+
+It is intended for use with the `baserock-import` tool.
+END
+
+class RubyGemChunkMorphologyGenerator < Importer::Base
+ def initialize
+ local_data = YAML.load_file(local_data_path("rubygems.yaml"))
+ @build_dependency_whitelist = local_data['build-dependency-whitelist']
+ end
+
+ def parse_options(arguments)
+ opts = create_option_parser(BANNER, DESCRIPTION)
+
+ parsed_arguments = opts.parse!(arguments)
+
+ if parsed_arguments.length != 2 && parsed_arguments.length != 3
+ STDERR.puts "Expected 2 or 3 arguments, got #{parsed_arguments}."
+ opts.parse(['-?'])
+ exit 255
+ end
+
+ source_dir, gem_name, expected_version = parsed_arguments
+ source_dir = File.absolute_path(source_dir)
+ if expected_version != nil
+ expected_version = Gem::Version.new(expected_version.dup)
+ end
+ [source_dir, gem_name, expected_version]
+ end
+
+ def load_local_gemspecs()
+ # Look for .gemspec files in the source repo.
+ #
+ # If there is no .gemspec, but you set 'name' and 'version' then
+ # inside Bundler::Source::Path.load_spec_files this call will create a
+ # fake gemspec matching that name and version. That's probably not useful.
+
+ dir = '.'
+
+ source = Bundler::Source::Path.new({
+ 'path' => dir,
+ })
+
+ log.info "Loaded #{source.specs.count} specs from source dir."
+ source.specs.each do |spec|
+ log.debug " * #{spec.inspect} #{spec.dependencies.inspect}"
+ end
+
+ source
+ end
+
+ def get_spec_for_gem(specs, gem_name)
+ found = specs[gem_name].select {|s| Gem::Platform.match(s.platform)}
+ if found.empty?
+ raise Exception,
+ "No Gemspecs found matching '#{gem_name}'"
+ elsif found.length != 1
+ raise Exception,
+ "Unsure which Gem to use for #{gem_name}, got #{found}"
+ end
+ found[0]
+ end
+
+ def chunk_name_for_gemspec(spec)
+ # Chunk names are the Gem's "full name" (name + version number), so
+ # that we don't break in the rare but possible case that two different
+ # versions of the same Gem are required for something to work. It'd be
+ # nicer to only use the full_name if we detect such a conflict.
+ spec.full_name
+ end
+
+ def is_signed_gem(spec)
+ spec.signing_key != nil
+ end
+
+ def generate_chunk_morph_for_gem(spec)
+ description = 'Automatically generated by rubygems.to_chunk'
+
+ bin_dir = "\"$DESTDIR/$PREFIX/bin\""
+ gem_dir = "\"$DESTDIR/$(gem environment home)\""
+
+ # There's more splitting to be done, but putting the docs in the
+ # correct artifact is the single biggest win for enabling smaller
+ # system images.
+ #
+ # Adding this to Morph's default ruleset is painful, because:
+ # - Changing the default split rules triggers a rebuild of everything.
+ # - The whole split rule code needs reworking to prevent overlaps and to
+ # make it possible to extend rules without creating overlaps. It's
+ # otherwise impossible to reason about.
+
+ split_rules = [
+ {
+ 'artifact' => "#{spec.full_name}-doc",
+ 'include' => [
+ 'usr/lib/ruby/gems/\d[\w.]*/doc/.*'
+ ]
+ }
+ ]
+
+ # It'd be rather tricky to include these build instructions as a
+ # BuildSystem implementation in Morph. The problem is that there's no
+ # way for the default commands to know what .gemspec file they should
+ # be building. It doesn't help that the .gemspec may be in a subdirectory
+ # (as in Rails, for example).
+ #
+ # Note that `gem help build` says the following:
+ #
+ # The best way to build a gem is to use a Rakefile and the
+ # Gem::PackageTask which ships with RubyGems.
+ #
+ # It's often possible to run `rake gem`, but this may require Hoe,
+ # rake-compiler, Jeweler or other assistance tools to be present at Gem
+ # construction time. It seems that many Ruby projects that use these tools
+ # also maintain an up-to-date generated .gemspec file, which means that we
+ # can get away with using `gem build` just fine in many cases.
+ #
+ # Were we to use `setup.rb install` or `rake install`, programs that loaded
+ # with the 'rubygems' library would complain that required Gems were not
+ # installed. We must have the Gem metadata available, and `gem build; gem
+ # install` seems the easiest way to achieve that.
+
+ configure_commands = []
+
+ if is_signed_gem(spec)
+ # This is a best-guess hack for allowing unsigned builds of Gems that are
+ # normally built signed. There's no value in building signed Gems when we
+ # control the build and deployment environment, and we obviously can't
+ # provide the private key of the Gem's maintainer.
+ configure_commands <<
+ "sed -e '/cert_chain\\s*=/d' -e '/signing_key\\s*=/d' -i " +
+ "#{spec.name}.gemspec"
+ end
+
+ build_commands = [
+ "gem build #{spec.name}.gemspec",
+ ]
+
+ install_commands = [
+ "mkdir -p #{gem_dir}",
+ "gem install --install-dir #{gem_dir} --bindir #{bin_dir} " +
+ "--ignore-dependencies --local ./#{spec.full_name}.gem"
+ ]
+
+ {
+ 'name' => chunk_name_for_gemspec(spec),
+ 'kind' => 'chunk',
+ 'description' => description,
+ 'build-system' => 'manual',
+ 'products' => split_rules,
+ 'configure-commands' => configure_commands,
+ 'build-commands' => build_commands,
+ 'install-commands' => install_commands,
+ }
+ end
+
+ def build_deps_for_gem(spec)
+    spec.dependencies.select do |d|
+ d.type == :development && @build_dependency_whitelist.member?(d.name)
+ end
+ end
+
+ def runtime_deps_for_gem(spec)
+ spec.dependencies.select {|d| d.type == :runtime}
+ end
+
+ def run
+ source_dir_name, gem_name, expected_version = parse_options(ARGV)
+
+ log.info("Creating chunk morph for #{gem_name} based on " +
+ "source code in #{source_dir_name}")
+
+ Dir.chdir(source_dir_name)
+
+ # Instead of reading the real Gemfile, invent one that simply includes the
+ # chosen .gemspec. If present, the Gemfile.lock will be honoured.
+ fake_gemfile = Bundler::Dsl.new
+ fake_gemfile.source('https://rubygems.org')
+ begin
+ fake_gemfile.gemspec({:name => gem_name})
+ rescue Bundler::InvalidOption
+ error "Did not find #{gem_name}.gemspec in #{source_dir_name}"
+ exit 1
+ end
+
+ definition = fake_gemfile.to_definition('Gemfile.lock', true)
+ resolved_specs = definition.resolve_remotely!
+
+ spec = get_spec_for_gem(resolved_specs, gem_name)
+
+ if not spec_is_from_current_source_tree(spec, source_dir_name)
+ error "Specified gem '#{spec.name}' doesn't live in the source in " +
+ "'#{source_dir_name}'"
+ log.debug "SPEC: #{spec.inspect} #{spec.source}"
+ exit 1
+ end
+
+ if expected_version != nil && spec.version != expected_version
+ # This check is brought to you by Coderay, which changes its version
+ # number based on an environment variable. Other Gems may do this too.
+ error "Source in #{source_dir_name} produces #{spec.full_name}, but " +
+ "the expected version was #{expected_version}."
+ exit 1
+ end
+
+ morph = generate_chunk_morph_for_gem(spec)
+
+  # One might think that you could use the Bundler::Dependency.groups
+  # field to filter, but it doesn't seem to be useful. Instead we go back
+  # to the Gem::Specification of the target Gem and use the dependencies
+  # field there. We look up each dependency in the resolved specs to find
+  # out what version of it Bundler has chosen.
+
+ def format_deps_for_morphology(specset, dep_list)
+ info = dep_list.collect do |dep|
+ spec = specset[dep][0]
+ [spec.name, spec.version.to_s]
+ end
+ Hash[info]
+ end
+
+ build_deps = format_deps_for_morphology(
+ resolved_specs, build_deps_for_gem(spec))
+ runtime_deps = format_deps_for_morphology(
+ resolved_specs, runtime_deps_for_gem(spec))
+
+ morph['x-build-dependencies-rubygems'] = build_deps
+ morph['x-runtime-dependencies-rubygems'] = runtime_deps
+
+ write_morph(STDOUT, morph)
+ end
+end
+
+RubyGemChunkMorphologyGenerator.new.run
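
The chunk morphology produced by `generate_chunk_morph_for_gem` above follows a fixed skeleton: an optional `sed` fix-up for signed Gems, then `gem build` plus `gem install`. A minimal sketch of that skeleton in plain Python (keys and commands taken from the code above; the function name is illustrative):

```python
def chunk_morph_for_gem(full_name, name, signed=False):
    """Skeleton of the chunk morphology built by rubygems.to_chunk.

    Chunk names use the Gem's full name (name + version) so that two
    versions of the same Gem can coexist in one stratum.
    """
    gem_dir = '"$DESTDIR/$(gem environment home)"'
    bin_dir = '"$DESTDIR/$PREFIX/bin"'

    configure = []
    if signed:
        # Strip signing directives so the Gem builds without the
        # maintainer's private key.
        configure.append(
            "sed -e '/cert_chain\\s*=/d' -e '/signing_key\\s*=/d' -i "
            '%s.gemspec' % name)

    return {
        'name': full_name,
        'kind': 'chunk',
        'build-system': 'manual',
        'configure-commands': configure,
        'build-commands': ['gem build %s.gemspec' % name],
        'install-commands': [
            'mkdir -p %s' % gem_dir,
            'gem install --install-dir %s --bindir %s '
            '--ignore-dependencies --local ./%s.gem'
            % (gem_dir, bin_dir, full_name),
        ],
    }
```

Installing with `--ignore-dependencies` works because the import tool records each dependency as its own chunk, so everything a Gem needs is built and installed separately.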
diff --git a/rubygems.to_lorry b/rubygems.to_lorry
new file mode 100755
index 0000000..7a00820
--- /dev/null
+++ b/rubygems.to_lorry
@@ -0,0 +1,164 @@
+#!/usr/bin/python
+#
+# Create a Baserock .lorry file for a given RubyGem
+#
+# Copyright (C) 2014 Codethink Limited
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; version 2 of the License.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License along
+# with this program; if not, write to the Free Software Foundation, Inc.,
+# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+
+
+import requests
+import requests_cache
+import yaml
+
+import logging
+import json
+import os
+import sys
+import urlparse
+
+from importer_base import ImportException, ImportExtension
+
+
+class GenerateLorryException(ImportException):
+ pass
+
+
+class RubyGemsWebServiceClient(object):
+ def __init__(self):
+ # Save hammering the rubygems.org API: 'requests' API calls are
+ # transparently cached in an SQLite database, instead.
+ requests_cache.install_cache('rubygems_api_cache')
+
+ def _request(self, url):
+ r = requests.get(url)
+ if r.ok:
+ return json.loads(r.text)
+ else:
+ raise GenerateLorryException(
+ 'Request to %s failed: %s' % (r.url, r.reason))
+
+ def get_gem_info(self, gem_name):
+ info = self._request(
+ 'http://rubygems.org/api/v1/gems/%s.json' % gem_name)
+
+ if info['name'] != gem_name:
+ # Sanity check
+            raise GenerateLorryException(
+                'Received info for Gem "%s", requested "%s"' % (
+                    info['name'], gem_name))
+
+ return info
+
+
+class RubyGemLorryGenerator(ImportExtension):
+ def __init__(self):
+ super(RubyGemLorryGenerator, self).__init__()
+
+ with open('rubygems.yaml', 'r') as f:
+            local_data = yaml.safe_load(f)
+
+ self.lorry_prefix = local_data['lorry-prefix']
+ self.known_source_uris = local_data['known-source-uris']
+
+ logging.debug(
+ "Loaded %i known source URIs from local metadata.", len(self.known_source_uris))
+
+ def process_args(self, args):
+ if len(args) != 1:
+ raise ImportException(
+ 'Please call me with the name of a RubyGem as an argument.\n')
+
+ gem_name = args[0]
+
+ lorry = self.generate_lorry_for_gem(gem_name)
+ self.write_lorry(sys.stdout, lorry)
+
+    def find_upstream_repo_for_gem(self, gem_name, gem_info):
+        source_code_uri = gem_info['source_code_uri']
+
+        if gem_name in self.known_source_uris:
+            logging.debug('Found %s in known-source-uris', gem_name)
+            known_uri = self.known_source_uris[gem_name]
+            if source_code_uri is not None and known_uri != source_code_uri:
+                sys.stderr.write(
+                    '%s: Hardcoded source URI %s doesn\'t match spec URI %s\n'
+                    % (gem_name, known_uri, source_code_uri))
+            return known_uri
+
+        if source_code_uri is not None and len(source_code_uri) > 0:
+            logging.debug('Got source_code_uri %s', source_code_uri)
+            if source_code_uri.endswith('/tree'):
+                source_code_uri = source_code_uri[:-len('/tree')]
+
+            return source_code_uri
+
+        homepage_uri = gem_info['homepage_uri']
+        if homepage_uri is not None and len(homepage_uri) > 0:
+            logging.debug('Got homepage_uri %s', homepage_uri)
+            netloc = urlparse.urlsplit(homepage_uri)[1]
+            if netloc == 'github.com':
+                return homepage_uri
+
+        # Further possible leads on locating source code.
+        # http://ruby-toolbox.com/projects/$gemname -> sometimes contains an
+        # upstream link, even if the gem info does not.
+        # https://github.com/search?q=$gemname -> often the first result is
+        # the correct one, but you can never know.
+
+        raise GenerateLorryException(
+            "Gem metadata for '%s' does not point to its source code "
+            "repository." % gem_name)
+
+    def project_name_from_repo(self, repo_url):
+        if repo_url.endswith('/tree/master'):
+            repo_url = repo_url[:-len('/tree/master')]
+        if repo_url.endswith('/'):
+            repo_url = repo_url[:-1]
+        if repo_url.endswith('.git'):
+            repo_url = repo_url[:-len('.git')]
+        return os.path.basename(repo_url)
+
+    def generate_lorry_for_gem(self, gem_name):
+        rubygems_client = RubyGemsWebServiceClient()
+
+        gem_info = rubygems_client.get_gem_info(gem_name)
+
+        gem_source_url = self.find_upstream_repo_for_gem(gem_name, gem_info)
+        logging.info('Got URL <%s> for %s', gem_source_url, gem_name)
+
+        project_name = self.project_name_from_repo(gem_source_url)
+        lorry_name = self.lorry_prefix + project_name
+
+        # One repo may produce multiple Gems. It's up to the caller to merge
+        # multiple .lorry files that get generated for the same repo.
+
+        lorry = {
+            lorry_name: {
+                'type': 'git',
+                'url': gem_source_url,
+                'x-products-rubygems': [gem_name]
+            }
+        }
+
+        return lorry
+
+    def write_lorry(self, stream, lorry):
+        json.dump(lorry, stream, indent=4)
+        # Needed so the morphlib.extensions code will pick up the last line.
+        stream.write('\n')
+
+
+if __name__ == '__main__':
+    RubyGemLorryGenerator().run()
diff --git a/rubygems.yaml b/rubygems.yaml
new file mode 100644
index 0000000..e1e6fcc
--- /dev/null
+++ b/rubygems.yaml
@@ -0,0 +1,49 @@
+---
+
+lorry-prefix: ruby-gems/
+
+# The :development dependency set is way too broad for our needs: for most Gems,
+# it includes test tools and development aids that aren't necessary for just
+# building the Gem. It's hard to even get a stratum if we include all these
+# tools because of the number of circular dependencies. Instead, only those
+# tools which are known to be required at Gem build time are listed as
+# build-dependencies, and any other :development dependencies are ignored.
+build-dependency-whitelist:
+  - hoe
+  # rake is bundled with Ruby, so it is not included in the whitelist.
+
+# The following Gems don't provide a source_code_uri in their Gem metadata.
+# Ideally they would; until then, their source repositories are listed here.
+known-source-uris:
+  appbundler: https://github.com/opscode/appbundler
+  ast: https://github.com/openSUSE/ast
+  brass: https://github.com/rubyworks/brass
+  coveralls: https://github.com/lemurheavy/coveralls-ruby
+  dep-selector-libgecode: https://github.com/opscode/dep-selector-libgecode
+  diff-lcs: https://github.com/halostatue/diff-lcs
+  erubis: https://github.com/kwatch/erubis
+  fog-brightbox: https://github.com/brightbox/fog-brightbox
+  highline: https://github.com/JEG2/highline
+  hoe: https://github.com/seattlerb/hoe
+  indexer: https://github.com/rubyworks/indexer
+  json: https://github.com/flori/json
+  method_source: https://github.com/banister/method_source
+  mixlib-authentication: https://github.com/opscode/mixlib-authentication
+  mixlib-cli: https://github.com/opscode/mixlib-cli
+  mixlib-log: https://github.com/opscode/mixlib-log
+  mixlib-shellout: http://github.com/opscode/mixlib-shellout
+  ohai: http://github.com/opscode/ohai
+  rack-cache: https://github.com/rtomayko/rack-cache
+  actionmailer: https://github.com/rails/rails
+  actionpack: https://github.com/rails/rails
+  actionview: https://github.com/rails/rails
+  activejob: https://github.com/rails/rails
+  activemodel: https://github.com/rails/rails
+  activerecord: https://github.com/rails/rails
+  activesupport: https://github.com/rails/rails
+  rails: https://github.com/rails/rails
+  railties: https://github.com/rails/rails
+  pg: https://github.com/ged/ruby-pg
+  sigar: https://github.com/hyperic/sigar
+  sprockets: https://github.com/sstephenson/sprockets
+  tins: https://github.com/flori/tins
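For reference, the lorry dict that generate_lorry_for_gem() assembles serializes to JSON as sketched below. This is a standalone illustration, not part of the imported files: the gem name 'chef' and its GitHub URL are hypothetical placeholders, and the 'ruby-gems/' prefix comes from the lorry-prefix setting in rubygems.yaml.

```python
import json
import sys

# Sketch of the .lorry output write_lorry() would produce for a
# hypothetical gem 'chef' whose upstream repository is on GitHub.
# The key is lorry_prefix + project_name; x-products-rubygems records
# which Gem(s) the repository produces.
lorry = {
    'ruby-gems/chef': {
        'type': 'git',
        'url': 'https://github.com/opscode/chef',
        'x-products-rubygems': ['chef'],
    }
}

# write_lorry() dumps the dict with indent=4 and appends a newline so
# that morphlib.extensions reads the final line.
json.dump(lorry, sys.stdout, indent=4)
sys.stdout.write('\n')
```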