diff options
author | RyanGlScott <ryan.gl.scott@gmail.com> | 2015-10-13 00:43:32 -0500 |
---|---|---|
committer | Austin Seipp <austin@well-typed.com> | 2015-10-13 00:43:42 -0500 |
commit | d2f9972a35ce05ceb8a78893e433ef1df06f73ef (patch) | |
tree | bc3cee0ec99aacd64298993fde590c017734c6e6 /libraries/ghc-boot | |
parent | 94ef79a766a1f58a5daadcf7dbb342812cd1a9bd (diff) | |
download | haskell-d2f9972a35ce05ceb8a78893e433ef1df06f73ef.tar.gz |
Make dataToQa aware of Data instances which use functions to implement toConstr
Trac #10796 exposes a way to make `template-haskell`'s `dataToQa` function
freak out if using a `Data` instance that produces a `Constr` (by means of
`toConstr`) using a function name instead of a data constructor name. While
such `Data` instances are somewhat questionable, they are nevertheless present
in popular libraries (e.g., `containers`), so we can at least make `dataToQa`
aware of their existence.
In order to properly distinguish strings which represent variables (as opposed
to data constructors), it was necessary to move functionality from `Lexeme` (in
`ghc`) to `GHC.Lexeme` in a new `ghc-boot` library (which was previously named
`bin-package-db`).
Reviewed By: goldfire, thomie
Differential Revision: https://phabricator.haskell.org/D1313
GHC Trac Issues: #10796
Diffstat (limited to 'libraries/ghc-boot')
-rw-r--r-- | libraries/ghc-boot/GHC/Lexeme.hs | 32 | ||||
-rw-r--r-- | libraries/ghc-boot/GHC/PackageDb.hs | 411 | ||||
-rw-r--r-- | libraries/ghc-boot/LICENSE | 31 | ||||
-rw-r--r-- | libraries/ghc-boot/ghc-boot.cabal | 45 |
4 files changed, 519 insertions, 0 deletions
diff --git a/libraries/ghc-boot/GHC/Lexeme.hs b/libraries/ghc-boot/GHC/Lexeme.hs new file mode 100644 index 0000000000..677c9a65e6 --- /dev/null +++ b/libraries/ghc-boot/GHC/Lexeme.hs @@ -0,0 +1,32 @@ +----------------------------------------------------------------------------- +-- | +-- Module : GHC.Lexeme +-- Copyright : (c) The GHC Team +-- +-- Maintainer : ghc-devs@haskell.org +-- Portability : portable +-- +-- Functions to evaluate whether or not a string is a valid identifier. +-- +module GHC.Lexeme ( + -- * Lexical characteristics of Haskell names + startsVarSym, startsVarId, startsConSym, startsConId, + startsVarSymASCII, isVarSymChar + ) where + +import Data.Char + +startsVarSym, startsVarId, startsConSym, startsConId :: Char -> Bool +startsVarSym c = startsVarSymASCII c || (ord c > 0x7f && isSymbol c) -- Infix Ids +startsConSym c = c == ':' -- Infix data constructors +startsVarId c = c == '_' || case generalCategory c of -- Ordinary Ids + LowercaseLetter -> True + OtherLetter -> True -- See #1103 + _ -> False +startsConId c = isUpper c || c == '(' -- Ordinary type constructors and data constructors + +startsVarSymASCII :: Char -> Bool +startsVarSymASCII c = c `elem` "!#$%&*+./<=>?@\\^|~-" + +isVarSymChar :: Char -> Bool +isVarSymChar c = c == ':' || startsVarSym c diff --git a/libraries/ghc-boot/GHC/PackageDb.hs b/libraries/ghc-boot/GHC/PackageDb.hs new file mode 100644 index 0000000000..672b7ebbe3 --- /dev/null +++ b/libraries/ghc-boot/GHC/PackageDb.hs @@ -0,0 +1,411 @@ +{-# LANGUAGE CPP #-} +-- This module deliberately defines orphan instances for now (Binary Version). +{-# OPTIONS_GHC -fno-warn-orphans -fno-warn-name-shadowing #-} +----------------------------------------------------------------------------- +-- | +-- Module : GHC.PackageDb +-- Copyright : (c) The University of Glasgow 2009, Duncan Coutts 2014 +-- +-- Maintainer : ghc-devs@haskell.org +-- Portability : portable +-- +-- This module provides the view of GHC's database of registered packages that +-- is shared between GHC the compiler\/library, and the ghc-pkg program. It +-- defines the database format that is shared between GHC and ghc-pkg. +-- +-- The database format, and this library are constructed so that GHC does not +-- have to depend on the Cabal library. The ghc-pkg program acts as the +-- gateway between the external package format (which is defined by Cabal) and +-- the internal package format which is specialised just for GHC. +-- +-- GHC the compiler only needs some of the information which is kept about +-- registerd packages, such as module names, various paths etc. On the other +-- hand ghc-pkg has to keep all the information from Cabal packages and be able +-- to regurgitate it for users and other tools. +-- +-- The first trick is that we duplicate some of the information in the package +-- database. We essentially keep two versions of the datbase in one file, one +-- version used only by ghc-pkg which keeps the full information (using the +-- serialised form of the 'InstalledPackageInfo' type defined by the Cabal +-- library); and a second version written by ghc-pkg and read by GHC which has +-- just the subset of information that GHC needs. +-- +-- The second trick is that this module only defines in detail the format of +-- the second version -- the bit GHC uses -- and the part managed by ghc-pkg +-- is kept in the file but here we treat it as an opaque blob of data. That way +-- this library avoids depending on Cabal. +-- +module GHC.PackageDb ( + InstalledPackageInfo(..), + ExposedModule(..), + OriginalModule(..), + BinaryStringRep(..), + emptyInstalledPackageInfo, + readPackageDbForGhc, + readPackageDbForGhcPkg, + writePackageDb + ) where + +import Data.Version (Version(..)) +import qualified Data.ByteString as BS +import qualified Data.ByteString.Char8 as BS.Char8 +import qualified Data.ByteString.Lazy as BS.Lazy +import qualified Data.ByteString.Lazy.Internal as BS.Lazy (defaultChunkSize) +import Data.Binary as Bin +import Data.Binary.Put as Bin +import Data.Binary.Get as Bin +import Control.Exception as Exception +import Control.Monad (when) +import System.FilePath +import System.IO +import System.IO.Error +import GHC.IO.Exception (IOErrorType(InappropriateType)) +import System.Directory + + +-- | This is a subset of Cabal's 'InstalledPackageInfo', with just the bits +-- that GHC is interested in. +-- +data InstalledPackageInfo instpkgid srcpkgid srcpkgname pkgkey modulename + = InstalledPackageInfo { + installedPackageId :: instpkgid, + sourcePackageId :: srcpkgid, + packageName :: srcpkgname, + packageVersion :: Version, + packageKey :: pkgkey, + depends :: [instpkgid], + importDirs :: [FilePath], + hsLibraries :: [String], + extraLibraries :: [String], + extraGHCiLibraries :: [String], + libraryDirs :: [FilePath], + frameworks :: [String], + frameworkDirs :: [FilePath], + ldOptions :: [String], + ccOptions :: [String], + includes :: [String], + includeDirs :: [FilePath], + haddockInterfaces :: [FilePath], + haddockHTMLs :: [FilePath], + exposedModules :: [ExposedModule instpkgid modulename], + hiddenModules :: [modulename], + instantiatedWith :: [(modulename,OriginalModule instpkgid modulename)], + exposed :: Bool, + trusted :: Bool + } + deriving (Eq, Show) + +-- | An original module is a fully-qualified module name (installed package ID +-- plus module name) representing where a module was *originally* defined +-- (i.e., the 'exposedReexport' field of the original ExposedModule entry should +-- be 'Nothing'). Invariant: an OriginalModule never points to a reexport. +data OriginalModule instpkgid modulename + = OriginalModule { + originalPackageId :: instpkgid, + originalModuleName :: modulename + } + deriving (Eq, Show) + +-- | Represents a module name which is exported by a package, stored in the +-- 'exposedModules' field. A module export may be a reexport (in which +-- case 'exposedReexport' is filled in with the original source of the module), +-- and may be a signature (in which case 'exposedSignature is filled in with +-- what the signature was compiled against). Thus: +-- +-- * @ExposedModule n Nothing Nothing@ represents an exposed module @n@ which +-- was defined in this package. +-- +-- * @ExposedModule n (Just o) Nothing@ represents a reexported module @n@ +-- which was originally defined in @o@. +-- +-- * @ExposedModule n Nothing (Just s)@ represents an exposed signature @n@ +-- which was compiled against the implementation @s@. +-- +-- * @ExposedModule n (Just o) (Just s)@ represents a reexported signature +-- which was originally defined in @o@ and was compiled against the +-- implementation @s@. +-- +-- We use two 'Maybe' data types instead of an ADT with four branches or +-- four fields because this representation allows us to treat +-- reexports/signatures uniformly. +data ExposedModule instpkgid modulename + = ExposedModule { + exposedName :: modulename, + exposedReexport :: Maybe (OriginalModule instpkgid modulename), + exposedSignature :: Maybe (OriginalModule instpkgid modulename) + } + deriving (Eq, Show) + +class BinaryStringRep a where + fromStringRep :: BS.ByteString -> a + toStringRep :: a -> BS.ByteString + +emptyInstalledPackageInfo :: (BinaryStringRep a, BinaryStringRep b, + BinaryStringRep c, BinaryStringRep d) + => InstalledPackageInfo a b c d e +emptyInstalledPackageInfo = + InstalledPackageInfo { + installedPackageId = fromStringRep BS.empty, + sourcePackageId = fromStringRep BS.empty, + packageName = fromStringRep BS.empty, + packageVersion = Version [] [], + packageKey = fromStringRep BS.empty, + depends = [], + importDirs = [], + hsLibraries = [], + extraLibraries = [], + extraGHCiLibraries = [], + libraryDirs = [], + frameworks = [], + frameworkDirs = [], + ldOptions = [], + ccOptions = [], + includes = [], + includeDirs = [], + haddockInterfaces = [], + haddockHTMLs = [], + exposedModules = [], + hiddenModules = [], + instantiatedWith = [], + exposed = False, + trusted = False + } + +-- | Read the part of the package DB that GHC is interested in. +-- +readPackageDbForGhc :: (BinaryStringRep a, BinaryStringRep b, BinaryStringRep c, + BinaryStringRep d, BinaryStringRep e) => + FilePath -> IO [InstalledPackageInfo a b c d e] +readPackageDbForGhc file = + decodeFromFile file getDbForGhc + where + getDbForGhc = do + _version <- getHeader + _ghcPartLen <- get :: Get Word32 + ghcPart <- get + -- the next part is for ghc-pkg, but we stop here. + return ghcPart + +-- | Read the part of the package DB that ghc-pkg is interested in +-- +-- Note that the Binary instance for ghc-pkg's representation of packages +-- is not defined in this package. This is because ghc-pkg uses Cabal types +-- (and Binary instances for these) which this package does not depend on. +-- +readPackageDbForGhcPkg :: Binary pkgs => FilePath -> IO pkgs +readPackageDbForGhcPkg file = + decodeFromFile file getDbForGhcPkg + where + getDbForGhcPkg = do + _version <- getHeader + -- skip over the ghc part + ghcPartLen <- get :: Get Word32 + _ghcPart <- skip (fromIntegral ghcPartLen) + -- the next part is for ghc-pkg + ghcPkgPart <- get + return ghcPkgPart + +-- | Write the whole of the package DB, both parts. +-- +writePackageDb :: (Binary pkgs, BinaryStringRep a, BinaryStringRep b, + BinaryStringRep c, BinaryStringRep d, BinaryStringRep e) => + FilePath -> [InstalledPackageInfo a b c d e] -> pkgs -> IO () +writePackageDb file ghcPkgs ghcPkgPart = + writeFileAtomic file (runPut putDbForGhcPkg) + where + putDbForGhcPkg = do + putHeader + put ghcPartLen + putLazyByteString ghcPart + put ghcPkgPart + where + ghcPartLen :: Word32 + ghcPartLen = fromIntegral (BS.Lazy.length ghcPart) + ghcPart = encode ghcPkgs + +getHeader :: Get (Word32, Word32) +getHeader = do + magic <- getByteString (BS.length headerMagic) + when (magic /= headerMagic) $ + fail "not a ghc-pkg db file, wrong file magic number" + + majorVersion <- get :: Get Word32 + -- The major version is for incompatible changes + + minorVersion <- get :: Get Word32 + -- The minor version is for compatible extensions + + when (majorVersion /= 1) $ + fail "unsupported ghc-pkg db format version" + -- If we ever support multiple major versions then we'll have to change + -- this code + + -- The header can be extended without incrementing the major version, + -- we ignore fields we don't know about (currently all). + headerExtraLen <- get :: Get Word32 + skip (fromIntegral headerExtraLen) + + return (majorVersion, minorVersion) + +putHeader :: Put +putHeader = do + putByteString headerMagic + put majorVersion + put minorVersion + put headerExtraLen + where + majorVersion = 1 :: Word32 + minorVersion = 0 :: Word32 + headerExtraLen = 0 :: Word32 + +headerMagic :: BS.ByteString +headerMagic = BS.Char8.pack "\0ghcpkg\0" + + +-- TODO: we may be able to replace the following with utils from the binary +-- package in future. + +-- | Feed a 'Get' decoder with data chunks from a file. +-- +decodeFromFile :: FilePath -> Get a -> IO a +decodeFromFile file decoder = + withBinaryFile file ReadMode $ \hnd -> + feed hnd (runGetIncremental decoder) + where + feed hnd (Partial k) = do chunk <- BS.hGet hnd BS.Lazy.defaultChunkSize + if BS.null chunk + then feed hnd (k Nothing) + else feed hnd (k (Just chunk)) + feed _ (Done _ _ res) = return res + feed _ (Fail _ _ msg) = ioError err + where + err = mkIOError InappropriateType loc Nothing (Just file) + `ioeSetErrorString` msg + loc = "GHC.PackageDb.readPackageDb" + +-- Copied from Cabal's Distribution.Simple.Utils. +writeFileAtomic :: FilePath -> BS.Lazy.ByteString -> IO () +writeFileAtomic targetPath content = do + let (targetDir, targetFile) = splitFileName targetPath + Exception.bracketOnError + (openBinaryTempFileWithDefaultPermissions targetDir $ targetFile <.> "tmp") + (\(tmpPath, handle) -> hClose handle >> removeFile tmpPath) + (\(tmpPath, handle) -> do + BS.Lazy.hPut handle content + hClose handle + renameFile tmpPath targetPath) + +instance (BinaryStringRep a, BinaryStringRep b, BinaryStringRep c, + BinaryStringRep d, BinaryStringRep e) => + Binary (InstalledPackageInfo a b c d e) where + put (InstalledPackageInfo + installedPackageId sourcePackageId + packageName packageVersion packageKey + depends importDirs + hsLibraries extraLibraries extraGHCiLibraries libraryDirs + frameworks frameworkDirs + ldOptions ccOptions + includes includeDirs + haddockInterfaces haddockHTMLs + exposedModules hiddenModules instantiatedWith + exposed trusted) = do + put (toStringRep installedPackageId) + put (toStringRep sourcePackageId) + put (toStringRep packageName) + put packageVersion + put (toStringRep packageKey) + put (map toStringRep depends) + put importDirs + put hsLibraries + put extraLibraries + put extraGHCiLibraries + put libraryDirs + put frameworks + put frameworkDirs + put ldOptions + put ccOptions + put includes + put includeDirs + put haddockInterfaces + put haddockHTMLs + put exposedModules + put (map toStringRep hiddenModules) + put (map (\(k,v) -> (toStringRep k, v)) instantiatedWith) + put exposed + put trusted + + get = do + installedPackageId <- get + sourcePackageId <- get + packageName <- get + packageVersion <- get + packageKey <- get + depends <- get + importDirs <- get + hsLibraries <- get + extraLibraries <- get + extraGHCiLibraries <- get + libraryDirs <- get + frameworks <- get + frameworkDirs <- get + ldOptions <- get + ccOptions <- get + includes <- get + includeDirs <- get + haddockInterfaces <- get + haddockHTMLs <- get + exposedModules <- get + hiddenModules <- get + instantiatedWith <- get + exposed <- get + trusted <- get + return (InstalledPackageInfo + (fromStringRep installedPackageId) + (fromStringRep sourcePackageId) + (fromStringRep packageName) packageVersion + (fromStringRep packageKey) + (map fromStringRep depends) + importDirs + hsLibraries extraLibraries extraGHCiLibraries libraryDirs + frameworks frameworkDirs + ldOptions ccOptions + includes includeDirs + haddockInterfaces haddockHTMLs + exposedModules + (map fromStringRep hiddenModules) + (map (\(k,v) -> (fromStringRep k, v)) instantiatedWith) + exposed trusted) + +instance Binary Version where + put (Version a b) = do + put a + put b + get = do + a <- get + b <- get + return (Version a b) + +instance (BinaryStringRep a, BinaryStringRep b) => + Binary (OriginalModule a b) where + put (OriginalModule originalPackageId originalModuleName) = do + put (toStringRep originalPackageId) + put (toStringRep originalModuleName) + get = do + originalPackageId <- get + originalModuleName <- get + return (OriginalModule (fromStringRep originalPackageId) + (fromStringRep originalModuleName)) + +instance (BinaryStringRep a, BinaryStringRep b) => + Binary (ExposedModule a b) where + put (ExposedModule exposedName exposedReexport exposedSignature) = do + put (toStringRep exposedName) + put exposedReexport + put exposedSignature + get = do + exposedName <- get + exposedReexport <- get + exposedSignature <- get + return (ExposedModule (fromStringRep exposedName) + exposedReexport + exposedSignature) diff --git a/libraries/ghc-boot/LICENSE b/libraries/ghc-boot/LICENSE new file mode 100644 index 0000000000..b5059b71f6 --- /dev/null +++ b/libraries/ghc-boot/LICENSE @@ -0,0 +1,31 @@ +The Glasgow Haskell Compiler License + +Copyright 2002, The University Court of the University of Glasgow. +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +- Redistributions of source code must retain the above copyright notice, +this list of conditions and the following disclaimer. + +- Redistributions in binary form must reproduce the above copyright notice, +this list of conditions and the following disclaimer in the documentation +and/or other materials provided with the distribution. + +- Neither name of the University nor the names of its contributors may be +used to endorse or promote products derived from this software without +specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE UNIVERSITY COURT OF THE UNIVERSITY OF +GLASGOW AND THE CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, +INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND +FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE +UNIVERSITY COURT OF THE UNIVERSITY OF GLASGOW OR THE CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH +DAMAGE. diff --git a/libraries/ghc-boot/ghc-boot.cabal b/libraries/ghc-boot/ghc-boot.cabal new file mode 100644 index 0000000000..98929b7f83 --- /dev/null +++ b/libraries/ghc-boot/ghc-boot.cabal @@ -0,0 +1,45 @@ +name: ghc-boot +version: 0.0.0.0 +license: BSD3 +maintainer: ghc-devs@haskell.org +bug-reports: glasgow-haskell-bugs@haskell.org +synopsis: Shared functionality between GHC and its boot libraries +description: This library is shared between GHC, ghc-pkg, and other boot + libraries. + . + A note about "GHC.PackageDb": it only deals with the subset of + the package database that the compiler cares about: modules + paths etc and not package metadata like description, authors + etc. It is thus not a library interface to ghc-pkg and is *not* + suitable for modifying GHC package databases. + . + The package database format and this library are constructed in + such a way that while ghc-pkg depends on Cabal, the GHC library + and program do not have to depend on Cabal. +cabal-version: >=1.10 +build-type: Simple + +source-repository head + type: git + location: http://git.haskell.org/ghc.git + subdir: libraries/ghc-boot + +Library + default-language: Haskell2010 + other-extensions: + GeneralizedNewtypeDeriving + RecordWildCards + StandaloneDeriving + Trustworthy + TypeSynonymInstances + + exposed-modules: + GHC.Lexeme + GHC.PackageDb + + build-depends: base >= 4 && < 5, + binary >= 0.7 && < 0.8, + bytestring >= 0.9 && < 1, + directory >= 1 && < 1.3, + filepath + |